NUCLEIC ACID SEQUENCES TO PROTEINS INVOLVED IN TOCOPHEROL SYNTHESIS

INTRODUCTION 

This application claims the benefit of the filing date of provisional Application
U.S. Serial Number 60/129,899, filed April 15, 1999, and provisional Application U.S.
Serial Number 60/146,461, filed July 30, 1999.

TECHNICAL FIELD 

The present invention is directed to nucleic acid and amino acid sequences and 
constructs, and methods related thereto. 

BACKGROUND 

Isoprenoids are ubiquitous compounds found in all living organisms. Plants synthesize
a diverse array of more than 22,000 isoprenoids (Connolly and Hill (1992) Dictionary of
Terpenoids, Chapman and Hall, New York, NY). In plants, isoprenoids play essential roles in
particular cell functions, such as the production of sterols, which contribute to eukaryotic
membrane architecture; acyclic polyprenoids found in the side chains of ubiquinone and
plastoquinone; growth regulators such as abscisic acid, gibberellins, and brassinosteroids;
and the photosynthetic pigments chlorophylls and carotenoids. Although the physiological
role of other plant isoprenoids, such as the vast array of secondary metabolites, is less
evident, some are known to play key roles in mediating adaptive responses to different
environmental challenges. In spite of the remarkable diversity of structure and function,
all isoprenoids originate from a single metabolic precursor, isopentenyl diphosphate (IPP)
(Wright, (1961) Annu. Rev. Biochem. 20:525-548; and Spurgeon and Porter, (1981) in
Biosynthesis of Isoprenoid Compounds, Porter and Spurgeon eds. (John Wiley, New York)
Vol. 1, pp. 1-46).

A number of unique and interconnected biochemical pathways derived from the
isoprenoid pathway, leading to secondary metabolites including tocopherols, exist in
chloroplasts of higher plants. Tocopherols not only perform vital functions in plants, but are
also important from mammalian nutritional perspectives. In plastids, tocopherols account for
up to 40% of the total quinone pool.

Tocopherols and tocotrienols (unsaturated tocopherol derivatives) are well known
antioxidants, and play an important role in protecting cells from free radical damage and in
the prevention of many diseases, including cardiac disease, cancer, cataracts, retinopathy,
Alzheimer's disease, and neurodegeneration, and have been shown to have beneficial effects
on symptoms of arthritis and in anti-aging. Vitamin E is used in chicken feed for improving
the shelf life, appearance, flavor, and oxidative stability of meat, and to transfer tocols from
feed to eggs. Vitamin E has been shown to be essential for normal reproduction, to improve
overall performance, and to enhance immunocompetence in livestock animals. Vitamin E
supplement in animal feed also imparts oxidative stability to milk products.

The demand for natural tocopherols as supplements has been steadily growing at a
rate of 10-20% for the past three years. At present, the demand exceeds the supply for natural
tocopherols, which are known to be more biopotent than racemic mixtures of synthetically
produced tocopherols. Naturally occurring tocopherols are all d-stereomers, whereas
synthetic α-tocopherol is a mixture of eight d,l-α-tocopherol isomers, only one of which
(12.5%) is identical to the natural d-α-tocopherol. Natural d-α-tocopherol has the highest
vitamin E activity (1.49 IU/mg) when compared to other natural tocopherols or tocotrienols.
The synthetic α-tocopherol has a vitamin E activity of 1.1 IU/mg. In 1995, the worldwide
market for raw refined tocopherols was $1020 million; synthetic materials comprised 85-88%
of the market, the remaining 12-15% being natural materials. The best sources of natural
tocopherols and tocotrienols are vegetable oils and grain products. Currently, most of the
natural Vitamin E is produced from γ-tocopherol derived from soy oil processing, which is
subsequently converted to α-tocopherol by chemical modification (α-tocopherol exhibits the
greatest biological activity).
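
The figures quoted in the preceding paragraph can be checked with a few lines of arithmetic. The following is only an illustrative sketch: it reads the synthetic activity as 1.1 IU/mg, assumes the eight isomers are present in equal parts, and uses hypothetical function names that do not come from this application.

```python
from fractions import Fraction

# Figures taken from the paragraph above.
STEREOISOMERS_IN_SYNTHETIC = 8       # synthetic alpha-tocopherol: eight isomers
NATURAL_ACTIVITY_IU_PER_MG = 1.49    # natural d-alpha-tocopherol
SYNTHETIC_ACTIVITY_IU_PER_MG = 1.1   # synthetic alpha-tocopherol (as read here)
MARKET_1995_MILLION_USD = 1020       # raw refined tocopherols, worldwide


def natural_isomer_share() -> Fraction:
    """Share of the synthetic mixture identical to the natural isomer.

    Assumes the eight isomers are present in equal amounts.
    """
    return Fraction(1, STEREOISOMERS_IN_SYNTHETIC)


def activity_ratio() -> float:
    """Natural versus synthetic vitamin E activity per milligram."""
    return NATURAL_ACTIVITY_IU_PER_MG / SYNTHETIC_ACTIVITY_IU_PER_MG


def natural_market_range():
    """Dollar range (in millions) implied by a 12-15% natural share."""
    low = MARKET_1995_MILLION_USD * 12 / 100
    high = MARKET_1995_MILLION_USD * 15 / 100
    return (low, high)
```

Under these assumptions the single natural isomer is one eighth of the synthetic mixture, which matches the 12.5% stated above, and the natural share of the 1995 market falls between roughly $122 million and $153 million.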

Methods of enhancing the levels of tocopherols and tocotrienols in plants, especially
levels of the more desirable compounds that can be used directly, without chemical
modification, would be useful to the art, as such molecules exhibit better functionality and
bioavailability.

In addition, methods for the increased production of other isoprenoid derived compounds
in a host plant cell are desirable. Furthermore, methods for the production of particular
isoprenoid compounds in a host plant cell are also needed.

SUMMARY OF THE INVENTION

The present invention is directed to prenyltransferase (PT), and in particular to PT
polynucleotides and polypeptides. The polynucleotides and polypeptides of the present
invention include those derived from prokaryotic and eukaryotic sources.

Thus, one aspect of the present invention relates to isolated polynucleotide sequences
encoding prenyltransferase proteins. In particular, isolated nucleic acid sequences encoding
PT proteins from bacterial and plant sources are provided.

Another aspect of the present invention relates to oligonucleotides which include 
partial or complete PT encoding sequences. 

It is also an aspect of the present invention to provide recombinant DNA constructs
which can be used for transcription or transcription and translation (expression) of
prenyltransferase. In particular, constructs are provided which are capable of transcription or
transcription and translation in host cells.

In another aspect of the present invention, methods are provided for production of
prenyltransferase in a host cell or progeny thereof. In particular, host cells are transformed or
transfected with a DNA construct which can be used for transcription or transcription and
translation of prenyltransferase. The recombinant cells which contain prenyltransferase are
also part of the present invention.

In a further aspect, the present invention relates to methods of using polynucleotide
and polypeptide sequences to modify the tocopherol content of host cells, particularly in host
plant cells. Plant cells having such a modified tocopherol content are also contemplated
herein.

The modified plants, seeds and oils obtained by the expression of the 
prenyltransferases are also considered part of the invention. 



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides an amino acid sequence alignment between ATPT2, ATPT3,
ATPT4, ATPT8, and ATPT12, performed using ClustalW.



Figure 2 provides a schematic picture of the expression construct pCGN10800.

Figure 3 provides a schematic picture of the expression construct pCGN10801.

Figure 4 provides a schematic picture of the expression construct pCGN10803.

Figure 5 provides a schematic picture of the expression construct pCGN10806.

Figure 6 provides a schematic picture of the expression construct pCGN10807.

Figure 7 provides a schematic picture of the expression construct pCGN10808.

Figure 8 provides a schematic picture of the expression construct pCGN10809.

Figure 9 provides a schematic picture of the expression construct pCGN10810.

Figure 10 provides a schematic picture of the expression construct pCGN10811.

Figure 11 provides a schematic picture of the expression construct pCGN10812.

Figure 12 provides a schematic picture of the expression construct pCGN10813.

Figure 13 provides a schematic picture of the expression construct pCGN10814.

Figure 14 provides a schematic picture of the expression construct pCGN10815.

Figure 15 provides a schematic picture of the expression construct pCGN10816.

Figure 16 provides a schematic picture of the expression construct pCGN10817.

Figure 17 provides a schematic picture of the expression construct pCGN10819.

Figure 18 provides a schematic picture of the expression construct pCGN10824.

Figure 19 provides a schematic picture of the expression construct pCGN10825.

Figure 20 provides a schematic picture of the expression construct pCGN10826.

Figure 21 provides an amino acid sequence alignment using ClustalW between the
Synechocystis sequence knockouts.

Figure 22 provides an amino acid sequence of the ATPT2, ATPT3, ATPT4, ATPT8,
and ATPT12 protein sequences from Arabidopsis and the slr1736, slr0926, sll1899, slr0056,
and the slr1518 amino acid sequences from Synechocystis.

Figure 23 provides the results of the enzymatic assay from preparations of wild type
Synechocystis strain 6803, and Synechocystis slr1736 knockout.

Figure 24 provides bar graphs of HPLC data obtained from seed extracts of transgenic
Arabidopsis containing pCGN10822, which provides for the expression of the ATPT2
sequence, in the sense orientation, from the napin promoter. Provided are graphs for alpha,
gamma, and delta tocopherols, as well as total tocopherol, for 22 transformed lines, as well as
a nontransformed (wildtype) control.

Figure 25 provides a bar graph of HPLC analysis of seed extracts from Arabidopsis
plants transformed with pCGN10803 (35S-ATPT2, in the antisense orientation), pCGN10802
(line 1625, napin ATPT2 in the sense orientation), pCGN10809 (line 1627, 35S-ATPT3 in
the sense orientation), a nontransformed (wt) control, and an empty vector transformed
control.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides, inter alia, compositions and methods for altering (for
example, increasing and decreasing) the tocopherol levels and/or modulating their ratios in
host cells. In particular, the present invention provides polynucleotides, polypeptides, and
methods of use thereof for the modulation of tocopherol content in host plant cells.

The present invention provides polynucleotide and polypeptide sequences involved in
the prenylation of straight chain and aromatic compounds. Straight chain prenyl transferases,
as used herein, comprise sequences which encode proteins involved in the prenylation of
straight chain compounds, including, but not limited to, geranylgeranyl pyrophosphate and
farnesyl pyrophosphate. Aromatic prenyl transferases, as used herein, comprise sequences
which encode proteins involved in the prenylation of aromatic compounds, including, but not
limited to, menaquinone, ubiquinone, chlorophyll, and homogentisic acid. The prenyl
transferase of the present invention preferably prenylates homogentisic acid.

The biosynthesis of α-tocopherol in higher plants involves condensation of
homogentisic acid and phytylpyrophosphate to form 2-methyl-6-phytylbenzoquinol, which can,
by cyclization and subsequent methylations (Fiedler et al., 1982, Planta, 155: 511-515; Soll
et al., 1980, Arch. Biochem. Biophys. 204: 544-550; Marshall et al., 1985, Phytochem., 24:
1705-1711, all of which are herein incorporated by reference in their entirety), form various
tocopherols. The Arabidopsis pds2 mutant, identified and characterized by Norris et al.
(1995), is deficient in tocopherol and plastoquinone-9 accumulation. Further genetic and
biochemical analysis suggests that the protein encoded by PDS2 may be responsible for the
prenylation of homogentisic acid. This may be a rate limiting step in tocopherol biosynthesis,
and this gene has yet to be isolated. Thus, it is an aspect of the present invention to provide
polynucleotides and polypeptides involved in the prenylation of homogentisic acid.

Isolated Polynucleotides, Proteins, and Polypeptides 

A first aspect of the present invention relates to isolated prenyltransferase
polynucleotides. The polynucleotide sequences of the present invention include isolated
polynucleotides that encode the polypeptides of the invention having a deduced amino acid
sequence selected from the group of sequences set forth in the Sequence Listing and to other
polynucleotide sequences closely related to such sequences and variants thereof.

The invention provides a polynucleotide sequence identical over its entire length to
each coding sequence as set forth in the Sequence Listing. The invention also provides the
coding sequence for the mature polypeptide or a fragment thereof, as well as the coding
sequence for the mature polypeptide or a fragment thereof in a reading frame with other
coding sequences, such as those encoding a leader or secretory sequence, a pre-, pro-, or
prepro- protein sequence. The polynucleotide can also include non-coding sequences,
including for example, but not limited to, non-coding 5' and 3' sequences, such as the
transcribed, untranslated sequences, termination signals, ribosome binding sites, sequences
that stabilize mRNA, introns, polyadenylation signals, and additional coding sequence that
encodes additional amino acids. For example, a marker sequence can be included to facilitate
the purification of the fused polypeptide. Polynucleotides of the present invention also
include polynucleotides comprising a structural gene and the naturally associated sequences
that control gene expression.

The invention also includes polynucleotides of the formula:

X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3
are any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and
1000, and R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence
selected from the group set forth in the Sequence Listing and preferably those of SEQ ID
NOs: 1, 3, 5, 7, 8, 10, 11, 13-16, 18, 23, 29, 36, and 38. In the formula, R2 is oriented so that
its 5' end residue is at the left, bound to R1, and its 3' end residue is at the right, bound to R3.
Any stretch of nucleic acid residues denoted by either R group, where R is greater than 1, may
be either a heteropolymer or a homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein that
encode variants of the polypeptides of the invention. Variants that are fragments of the
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein
5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the
invention are substituted, added or deleted, in any combination. Particularly preferred are
substitutions, additions, and deletions that are silent such that they do not alter the properties
or activities of the polynucleotide or polypeptide.

Further preferred embodiments of the invention are polynucleotides that are at least 50%,
60%, or 70% identical over their entire length to a polynucleotide encoding a polypeptide of
the invention, and polynucleotides that are complementary to such polynucleotides. More
preferable are polynucleotides that comprise a region that is at least 80% identical over its
entire length to a polynucleotide encoding a polypeptide of the invention and polynucleotides
that are complementary thereto. In this regard, polynucleotides at least 90% identical over
their entire length are particularly preferred, and those at least 95% identical are especially
preferred. Further, those with at least 97% identity are highly preferred and those with at
least 98% and 99% identity are particularly highly preferred, with those at least 99% being
the most highly preferred.
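
The graded identity thresholds in the preceding paragraph can be sketched as a lookup. This is an illustration only, under two assumptions: the two sequences are already aligned and of equal length (no gaps are handled), and the tier labels simply echo the wording above. The names percent_identity and preference_tier are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two equal-length sequences.

    Assumes the sequences are already aligned over their entire length.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Thresholds and labels paraphrased from the text, highest first.
TIERS = [
    (99.0, "most highly preferred"),
    (98.0, "particularly highly preferred"),
    (97.0, "highly preferred"),
    (95.0, "especially preferred"),
    (90.0, "particularly preferred"),
    (80.0, "more preferable"),
    (50.0, "further preferred"),
]


def preference_tier(identity_pct: float) -> str:
    """Return the label of the highest threshold met by identity_pct."""
    for threshold, label in TIERS:
        if identity_pct >= threshold:
            return label
    return "below the stated thresholds"
```

For example, two ten-base sequences differing at one position are 90% identical and fall in the "particularly preferred" tier under this reading.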

Preferred embodiments are polynucleotides that encode polypeptides that retain
substantially the same biological function or activity as the mature polypeptides encoded by
the polynucleotides set forth in the Sequence Listing.

The invention further relates to polynucleotides that hybridize to the above-described
sequences. In particular, the invention relates to polynucleotides that hybridize under
stringent conditions to the above-described polynucleotides. As used herein, the terms
"stringent conditions" and "stringent hybridization conditions" mean that hybridization will
generally occur if there is at least 95% and preferably at least 97% identity between the
sequences. An example of stringent hybridization conditions is overnight incubation at 42°C
in a solution comprising 50% formamide, 5x SSC (1x SSC = 150 mM NaCl, 15 mM trisodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and
20 micrograms/milliliter denatured, sheared salmon sperm DNA, followed by washing the
hybridization support in 0.1x SSC at approximately 65°C. Other hybridization and wash
conditions are well known and are exemplified in Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor, NY (1989), particularly Chapter 11.
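
To make the salt strengths of the hybridization and wash steps concrete, the sketch below scales a 1x SSC definition to any fold concentration. It assumes that the 150 mM NaCl and 15 mM trisodium citrate figures describe 1x SSC, which is the conventional definition; the function name ssc_concentrations is illustrative and not part of this application.

```python
# Composition of 1x SSC in millimolar (assumed reading of the figures above).
ONE_X_SSC_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold: float) -> dict:
    """Return component concentrations (mM) for a given fold of SSC."""
    if fold <= 0:
        raise ValueError("fold concentration must be positive")
    # Round to suppress floating point noise such as 15.000000000000002.
    return {name: round(fold * mm, 3) for name, mm in ONE_X_SSC_MM.items()}


hybridization = ssc_concentrations(5)    # 5x SSC used during hybridization
wash = ssc_concentrations(0.1)           # 0.1x SSC used for the wash
```

On this reading, the hybridization solution contains 750 mM NaCl and 75 mM trisodium citrate, while the wash contains 15 mM NaCl and 1.5 mM trisodium citrate; the fifty-fold drop in salt, together with the higher temperature, is what makes the wash the more stringent step.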

The invention also provides a polynucleotide consisting essentially of a
polynucleotide sequence obtainable by screening an appropriate library containing the
complete gene for a polynucleotide sequence set forth in the Sequence Listing under stringent
hybridization conditions with a probe having the sequence of said polynucleotide sequence or
a fragment thereof; and isolating said polynucleotide sequence. Fragments useful for
obtaining such a polynucleotide include, for example, probes and primers as described herein.

As discussed herein regarding polynucleotide assays of the invention, for example,
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least
15 bases. Preferably such probes will have at least 30 bases and can have at least 50 bases.
Particularly preferred probes will have between 30 bases and 50 bases, inclusive.

The coding region of each gene that comprises or is comprised by a polynucleotide
sequence set forth in the Sequence Listing may be isolated by screening using a DNA
sequence provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled
oligonucleotide having a sequence complementary to that of a gene of the invention is then
used to screen a library of cDNA, genomic DNA or mRNA to identify members of the library
which hybridize to the probe. For example, synthetic oligonucleotides are prepared which
correspond to the prenyltransferase EST sequences. The oligonucleotides are used as primers
in polymerase chain reaction (PCR) techniques to obtain 5' and 3' terminal sequence of
prenyltransferase genes. Alternatively, where oligonucleotides of low degeneracy can be
prepared from particular prenyltransferase peptides, such probes may be used directly to
screen gene libraries for prenyltransferase gene sequences. In particular, screening of cDNA
libraries in phage vectors is useful in such methods due to lower levels of background
hybridization.
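
Deriving an oligonucleotide complementary to a stretch of a known sequence, as described above, amounts to taking a reverse complement. The following is a minimal sketch; the sequence est_fragment is a made-up placeholder rather than a sequence from the Sequence Listing, and the function names are hypothetical.

```python
# Watson-Crick pairing for unambiguous DNA bases (upper or lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    unexpected = set(seq) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


def complementary_probe(target: str, start: int, length: int) -> str:
    """Probe (5' to 3') complementary to target[start:start + length]."""
    window = target[start:start + length]
    if len(window) != length:
        raise ValueError("window extends past the end of the target")
    return reverse_complement(window)


# Hypothetical 24-base stretch standing in for part of an EST.
est_fragment = "ATGGCTTCACTGAGGCTTAAGCCA"
probe = complementary_probe(est_fragment, start=3, length=15)
```

The 15-base length used here matches the minimum probe size mentioned earlier in this section; a practical design would also weigh melting temperature and degeneracy, which this sketch does not attempt.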

Typically, a prenyltransferase sequence obtainable from the use of nucleic acid probes
will show 60-70% sequence identity between the target prenyltransferase sequence and the
encoding sequence used as a probe. However, lengthy sequences with as little as 50-60%
sequence identity may also be obtained. The nucleic acid probes may be a lengthy fragment
of the nucleic acid sequence, or may also be a shorter, oligonucleotide probe. When longer
nucleic acid fragments are employed as probes (greater than about 100 bp), one may screen at
lower stringencies in order to obtain sequences from the target sample which have 20-50%
deviation (i.e., 50-80% sequence homology) from the sequences used as probe.
Oligonucleotide probes can be considerably shorter than the entire nucleic acid sequence
encoding a prenyltransferase enzyme, but should be at least about 10, preferably at least
about 15, and more preferably at least about 20 nucleotides. A higher degree of sequence
identity is desired when shorter regions are used as opposed to longer regions. It may thus be
desirable to identify regions of highly conserved amino acid sequence to design
oligonucleotide probes for detecting and recovering other related prenyltransferase genes.
Shorter probes are often particularly useful for polymerase chain reactions (PCR), especially
when highly conserved sequences can be identified. (See, Gould, et al., PNAS USA (1989)
86:1934-1938.)
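
One simple way to locate the highly conserved regions mentioned above is to scan an existing multiple alignment for runs of columns in which every sequence carries the same residue. The sketch below assumes the alignment has already been produced by another tool, uses "-" for gaps, and works on made-up example peptides; conserved_runs is a hypothetical helper name.

```python
def conserved_runs(aligned, min_len=7):
    """Find runs of fully conserved columns in a multiple alignment.

    aligned: list of equal-length strings, with '-' marking gaps.
    Returns (start, end) column ranges, 0-based with end exclusive,
    for every run of at least min_len conserved, gap-free columns.
    """
    if not aligned:
        raise ValueError("alignment is empty")
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None
    # Iterate one column past the end so a trailing run is closed.
    for col in range(width + 1):
        conserved = (
            col < width
            and aligned[0][col] != "-"
            and all(seq[col] == aligned[0][col] for seq in aligned)
        )
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


# Made-up aligned peptides, for illustration only.
example = ["MKTAYIAKQR", "MRTAYIAKQL", "MKTAYIAK-R"]
regions = conserved_runs(example, min_len=5)
```

A conserved peptide run found this way could then be back-translated into a low-degeneracy oligonucleotide; that step is not shown here.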

Another aspect of the present invention relates to prenyltransferase polypeptides.
Such polypeptides include isolated polypeptides set forth in the Sequence Listing, as well as
polypeptides and fragments thereof, particularly those polypeptides which exhibit
prenyltransferase activity and also those polypeptides which have at least 50%, 60% or 70%
identity, preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of
sequences set forth in the Sequence Listing, and also include portions of such polypeptides,
wherein such portion of the polypeptide preferably includes at least 30 amino acids and more
preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing
the sequences. In the art, "identity" also means the degree of sequence relatedness between
polypeptide or polynucleotide sequences, as determined by the match between strings of such
sequences. "Identity" can be readily calculated by known methods including, but not limited
to, those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin,
A.M. and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in
Molecular Biology, von Heijne, G., Academic Press (1987); Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., Stockton Press, New York (1991); and Carillo, H., and
Lipman, D., SIAM J Applied Math, 48:1073 (1988). Methods to determine identity are
designed to give the largest match between the sequences tested. Moreover, methods to
determine identity are codified in publicly available programs. Computer programs which
can be used to determine identity between two sequences include, but are not limited to, GCG
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)); and the suite of five BLAST
programs, three designed for nucleotide sequence queries (BLASTN, BLASTX, and
TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN)
(Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren, et al., Genome Analysis, 1:
543-559 (1997)). The BLASTX program is publicly available from NCBI and other sources
(BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et
al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman algorithm can also
be used to determine identity.

Parameters for polypeptide sequence comparison typically include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)

Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA
89:10915-10919 (1992)

Gap Penalty: 12

Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters, along
with no penalty for end gaps, are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)

Comparison matrix: matches = +10; mismatches = 0

Gap Penalty: 50

Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the
default parameters for nucleic acid comparisons.
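
The polynucleotide parameters listed above can be tried out with a small global alignment scorer. The sketch below is not the GCG "gap" program; it is a generic Needleman-Wunsch style dynamic program with affine gaps, written under two stated assumptions: a gap of length k costs the gap penalty plus k times the gap length penalty, and end gaps are penalized like internal gaps. The function name global_alignment_score is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0,
                           gap_open=50, gap_extend=3):
    """Best global alignment score of a and b with affine gap costs.

    A gap of length k costs gap_open + k * gap_extend (assumed convention).
    """
    n, m = len(a), len(b)
    # M: last column pairs a[i-1] with b[j-1]
    # X: last column pairs a[i-1] with a gap
    # Y: last column pairs b[j-1] with a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)

    first_gap = gap_open + gap_extend  # cost of the first gapped column
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = pair + max(M[i - 1][j - 1],
                                 X[i - 1][j - 1],
                                 Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - first_gap,
                          Y[i - 1][j] - first_gap,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - first_gap,
                          X[i][j - 1] - first_gap,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])
```

With these defaults, two identical four-base sequences score 40, and removing one internal base from an eight-base sequence leaves seven matches minus one gap of length one, for a score of 17. The peptide parameters would additionally need the BLOSUM62 matrix, which is omitted here for brevity.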

The invention also includes polypeptides of the formula:

X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or
a metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is
an amino acid sequence of the invention, particularly an amino acid sequence selected from
the group set forth in the Sequence Listing and preferably those encoded by the sequences
provided in SEQ ID NOs: 2, 4, 6, 9, 12, 17, 19-22, 24-28, 30, 32-35, 37, and 39. In the
formula, R2 is oriented so that its amino terminal residue is at the left, bound to R1, and its
carboxy terminal residue is at the right, bound to R3. Any stretch of amino acid residues
denoted by either R group, where R is greater than 1, may be either a heteropolymer or a
homopolymer, preferably a heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by a
polynucleotide comprising a sequence selected from the group of sequences contained in the
Sequence Listing set forth herein.

The polypeptides of the present invention can be a mature protein or can be part of a
fusion protein.

Fragments and variants of the polypeptides are also considered to be a part of the
invention. A fragment is a variant polypeptide which has an amino acid sequence that is
entirely the same as part but not all of the amino acid sequence of the previously described
polypeptides. The fragments can be "free-standing" or comprised within a larger polypeptide
of which the fragment forms a part or a region, most preferably as a single continuous region.
Preferred fragments are biologically active fragments, which are those fragments that mediate
activities of the polypeptides of the invention, including those with similar activity or
improved activity or with a decreased activity. Also included are those fragments that are
antigenic or immunogenic in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set
forth in the Sequence Listing by conservative amino acid substitutions, that is, substitution
of a residue by another with like characteristics. In general, such substitutions are among
Ala, Val, Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln;
between Lys and Arg; or between Phe and Tyr. Particularly preferred are variants in which
5 to 10; 1 to 5; 1 to 3 or one amino acid(s) are substituted, deleted, or added, in any
combination.
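
The substitution groups named in the preceding paragraph can be encoded directly as sets. The sketch below reads the groups as Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg and Phe/Tyr, uses one-letter amino acid codes, and compares equal-length sequences only; the helper names and the example peptides are hypothetical.

```python
# Conservative substitution groups from the text, in one-letter codes.
CONSERVATIVE_GROUPS = [
    {"A", "V", "L", "I"},  # Ala, Val, Leu, Ile
    {"S", "T"},            # Ser, Thr
    {"D", "E"},            # Asp, Glu
    {"N", "Q"},            # Asn, Gln
    {"K", "R"},            # Lys, Arg
    {"F", "Y"},            # Phe, Tyr
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing original with replacement stays within one group."""
    o, r = original.upper(), replacement.upper()
    if o == r:
        return False  # identical residues are not a substitution
    return any(o in group and r in group for group in CONSERVATIVE_GROUPS)


def count_substitutions(reference: str, variant: str):
    """Return (conservative, non_conservative) substitution counts."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative = non_conservative = 0
    for ref_res, var_res in zip(reference.upper(), variant.upper()):
        if ref_res == var_res:
            continue
        if is_conservative(ref_res, var_res):
            conservative += 1
        else:
            non_conservative += 1
    return (conservative, non_conservative)
```

For the made-up peptides MKLSD and MRISA, this counts two conservative changes (Lys to Arg, Leu to Ile) and one non-conservative change (Asp to Ala). Insertions and deletions, which the paragraph also allows, would require an alignment step first.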

Variants that are fragments of the polypeptides of the invention can be used to
produce the corresponding full length polypeptide by peptide synthesis. Therefore, these
variants can be used as intermediates for producing the full-length polypeptides of the
invention.

The polynucleotides and polypeptides of the invention can be used, for example, in
the transformation of host cells, such as plant host cells, as further discussed herein.

The invention also provides polynucleotides that encode a polypeptide that is a mature
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the
mature polypeptide (for example, when the mature form of the protein has more than one
polypeptide chain). Such sequences can, for example, play a role in the processing of a
protein from a precursor to a mature form, allow protein transport, shorten or lengthen protein
half-life, or facilitate manipulation of the protein in assays or production. It is contemplated
that cellular enzymes can be used to remove any additional amino acids from the mature
protein.

A precursor protein, having the mature form of the polypeptide fused to one or more
prosequences, may be an inactive form of the polypeptide. The inactive precursors generally
are activated when the prosequences are removed. Some or all of the prosequences may be
removed prior to activation. Such precursor proteins are generally called proproteins.

Plant Constructs and Methods of Use 

Of particular interest is the use of the nucleotide sequences in recombinant DNA
constructs to direct the transcription or transcription and translation (expression) of the
prenyltransferase sequences of the present invention in a host plant cell. The expression
constructs generally comprise a promoter functional in a host plant cell operably linked to a
nucleic acid sequence encoding a prenyltransferase of the present invention and a
transcriptional termination region functional in a host plant cell.

A first nucleic acid sequence is "operably linked" or "operably associated" with a
second nucleic acid sequence when the sequences are so arranged that the first nucleic acid
sequence affects the function of the second nucleic acid sequence. Preferably, the two
sequences are part of a single contiguous nucleic acid molecule and more preferably are
adjacent. For example, a promoter is operably linked to a gene if the promoter regulates or
mediates transcription of the gene in a cell.

Those skilled in the art will recognize that there are a number of promoters which are
functional in plant cells, and have been described in the literature. Chloroplast and plastid
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid
operable promoters are also envisioned.

One set of plant functional promoters are constitutive promoters such as the
CaMV35S or FMV35S promoters that yield high levels of expression in most plant organs.
Enhanced or duplicated versions of the CaMV35S and FMV35S promoters are useful in the
practice of this invention (Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent
Number 5,378,619). In addition, it may also be preferred to bring about expression of the
prenyltransferase gene in specific tissues of the plant, such as leaf, stem, root, tuber, seed,
fruit, etc., and the promoter chosen should have the desired tissue and developmental
specificity.

Of particular interest is the expression of the nucleic acid sequences of the present
invention from transcription initiation regions which are preferentially expressed in a plant
seed tissue. Examples of such seed preferential transcription initiation sequences include
those sequences derived from sequences encoding plant storage protein genes or from genes
involved in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5'
regulatory regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)),
phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit
of β-conglycinin (soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and
oleosin.

It may be advantageous to direct the localization of proteins conferring
prenyltransferase to a particular subcellular compartment, for example, to the mitochondrion,
endoplasmic reticulum, vacuoles, chloroplast or other plastidic compartment. For example,
where the genes of interest of the present invention will be targeted to plastids, such as
chloroplasts, for expression, the constructs will also employ the use of sequences to direct the
gene to the plastid. Such sequences are referred to herein as chloroplast transit peptides
(CTP) or plastid transit peptides (PTP). In this manner, where the gene of interest is not
directly inserted into the plastid, the expression construct will additionally contain a gene
encoding a transit peptide to direct the gene of interest to the plastid. The chloroplast transit
peptides may be derived from the gene of interest, or may be derived from a heterologous
sequence having a CTP. Such transit peptides are known in the art. See, for example, Von
Heijne et al. (1991) Plant Mol. Biol. Rep. 9:104-126; Clark et al. (1989) J. Biol. Chem.
264:17544-17550; della-Cioppa et al. (1987) Plant Physiol. 84:965-968; Romer et al. (1993)
Biochem. Biophys. Res. Commun. 196:1414-1421; and Shah et al. (1986) Science 233:478-481.

Depending upon the intended use, the constructs may contain the nucleic acid
sequence which encodes the entire prenyltransferase protein, or a portion thereof. For
example, where antisense inhibition of a given prenyltransferase protein is desired, the entire
prenyltransferase sequence is not required. Furthermore, where prenyltransferase sequences
used in constructs are intended for use as probes, it may be advantageous to prepare
constructs containing only a particular portion of a prenyltransferase encoding sequence, for
example a sequence which is discovered to encode a highly conserved prenyltransferase
region.

The skilled artisan will recognize that there are various methods for the inhibition of
expression of endogenous sequences in a host cell. Such methods include, but are not limited
to, antisense suppression (Smith, et al. (1988) Nature 334:724-726), co-suppression (Napoli,
et al. (1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and
combinations of sense and antisense (Waterhouse, et al. (1998) Proc. Natl. Acad. Sci. USA
95:13959-13964). Methods for the suppression of endogenous sequences in a host cell
typically employ the transcription or transcription and translation of at least a portion of the
sequence to be suppressed. Such sequences may be homologous to coding as well as
non-coding regions of the endogenous sequence.

Regulatory transcript termination regions may be provided in plant expression
constructs of this invention as well. Transcript termination regions may be provided by the
DNA sequence encoding the prenyltransferase or a convenient transcription termination
region derived from a different gene source, for example, the transcript termination region
which is naturally associated with the transcript initiation region. The skilled artisan will
recognize that any convenient transcript termination region which is capable of terminating
transcription in a plant cell may be employed in the constructs of the present invention.

Alternatively, constructs may be prepared to direct the expression of the
prenyltransferase sequences directly from the host plant cell plastid. Such constructs and
methods are known in the art and are generally described, for example, in Svab, et al. (1990)
Proc. Natl. Acad. Sci. USA 87:8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci.
USA 90:913-917 and in U.S. Patent Number 5,693,507.

The prenyltransferase constructs of the present invention can be used in
transformation methods with additional constructs providing for the expression of other
nucleic acid sequences encoding proteins involved in the production of tocopherols, or
tocopherol precursors such as homogentisic acid and/or phytylpyrophosphate. Nucleic acid
sequences encoding proteins involved in the production of homogentisic acid are known in
the art, and include, but are not limited to, 4-hydroxyphenylpyruvatedioxygenase (HPPD, EC
1.13.11.27) described, for example, by Garcia, et al. ((1999) Plant Physiol. 119(4):1507-1516),
mono or bifunctional tyrA (described, for example, by Xia, et al. (1992) J. Gen.
Microbiol. 138:1309-1316, and Hudson, et al. (1984) J. Mol. Biol. 180:1023-1051),
Oxygenase, 4-hydroxyphenylpyruvatedi- (9CI), 4-Hydroxyphenylpyruvatedioxygenase;
p-Hydroxyphenylpyruvatedioxygenase; p-Hydroxyphenylpyruvate hydroxylase;
p-Hydroxyphenylpyruvate oxidase; p-Hydroxyphenylpyruvic acid hydroxylase;
p-Hydroxyphenylpyruvic hydroxylase; p-Hydroxyphenylpyruvic oxidase),
4-hydroxyphenylacetate, NAD(P)H:oxygen oxidoreductase (1-hydroxylating);
4-hydroxyphenylacetate 1-monooxygenase, and the like. In addition, constructs for the
expression of nucleic acid sequences encoding proteins involved in the production of
phytylpyrophosphate can also be employed with the prenyltransferase constructs of the
present invention. Nucleic acid sequences encoding proteins involved in the production of
phytylpyrophosphate are known in the art, and include, but are not limited to,
geranylgeranylpyrophosphate synthase (GGPPS), geranylgeranylpyrophosphate reductase
(GGH), 1-deoxyxylulose-5-phosphate synthase, 1-deoxy-D-xylulose-5-phosphate
reductoisomerase, 4-diphosphocytidyl-2-C-methylerythritol synthase, and isopentenyl
pyrophosphate isomerase.

The prenyltransferase sequences of the present invention find use in the preparation of
transformation constructs having a second expression cassette for the expression of additional
sequences involved in tocopherol biosynthesis. Additional tocopherol biosynthesis
sequences of interest in the present invention include, but are not limited to, gamma-tocopherol
methyltransferase (Shintani, et al. (1998) Science 282(5396):2098-2100), tocopherol cyclase,
and tocopherol methyltransferase.

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs
containing the expression constructs have been introduced is considered transformed,
transfected, or transgenic. A transgenic or transformed cell or plant also includes progeny of
the cell or plant and progeny produced from a breeding program employing such a transgenic
plant as a parent in a cross and exhibiting an altered phenotype resulting from the presence of
a prenyltransferase nucleic acid sequence.

Plant expression or transcription constructs having a prenyltransferase as the DNA
sequence of interest for increased or decreased expression thereof may be employed with a
wide variety of plant life, particularly, plant life involved in the production of vegetable oils
for edible and industrial uses. Particularly preferred plants for use in the methods of the
present invention include, but are not limited to: Acacia, alfalfa, aneth, apple, apricot,
artichoke, arugula, asparagus, avocado, banana, barley, beans, beet, blackberry, blueberry,
broccoli, brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, celery,
cherry, chicory, cilantro, citrus, Clementines, coffee, corn, cotton, cucumber, Douglas fir,
eggplant, endive, escarole, eucalyptus, fennel, figs, garlic, gourd, grape, grapefruit, honey
dew, jicama, kiwifruit, lettuce, leeks, lemon, lime, Loblolly pine, mango, melon, mushroom,
nectarine, nut, oat, oil palm, oil seed rape, okra, onion, orange, an ornamental plant, papaya,
parsley, pea, peach, peanut, pear, pepper, persimmon, pine, pineapple, plantain, plum,
pomegranate, poplar, potato, pumpkin, quince, radiata pine, radicchio, radish, raspberry, rice,
rye, sorghum, Southern pine, soybean, spinach, squash, strawberry, sugarbeet, sugarcane,
sunflower, sweet potato, sweetgum, tangerine, tea, tobacco, tomato, triticale, turf, turnip, a
vine, watermelon, wheat, yams, and zucchini.

Most especially preferred are temperate oilseed crops. Temperate oilseed crops of interest
include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties), sunflower,
safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Depending on the
method for introducing the recombinant constructs into the host cell, other DNA sequences
may be required. Importantly, this invention is applicable to dicotyledonous and
monocotyledonous species alike and will be readily applicable to new and/or improved
transformation and regulation techniques.

Of particular interest is the use of prenyltransferase constructs in plants to produce
plants or plant parts, including, but not limited to, leaves, stems, roots, reproductive, and seed,
with a modified content of tocopherols in plant parts having transformed plant cells.

For immunological screening, antibodies to the protein can be prepared by injecting
rabbits or mice with the purified protein or portion thereof, such methods of preparing
antibodies being well known to those in the art. Either monoclonal or polyclonal antibodies
can be produced, although typically polyclonal antibodies are more useful for gene isolation.
Western analysis may be conducted to determine that a related protein is present in a crude
extract of the desired plant species, as determined by cross-reaction with the antibodies to the
encoded proteins. When cross-reactivity is observed, genes encoding the related proteins are
isolated by screening expression libraries representing the desired plant species. Expression
libraries can be constructed in a variety of commercially available vectors, including lambda
gt11, as described in Sambrook, et al. (Molecular Cloning: A Laboratory Manual, Second
Edition (1989) Cold Spring Harbor Laboratory, Cold Spring Harbor, New York).

To confirm the activity and specificity of the proteins encoded by the identified
nucleic acid sequences as prenyltransferase enzymes, in vitro assays are performed in insect
cell cultures using baculovirus expression systems. Such baculovirus expression systems are
known in the art and are described by Lee, et al., U.S. Patent Number 5,348,886, the entirety
of which is herein incorporated by reference.

In addition, other expression constructs may be prepared to assay for protein activity
utilizing different expression systems. Such expression constructs are transformed into yeast
or prokaryotic host and assayed for prenyltransferase activity. Such expression systems are
known in the art and are readily available through commercial sources.

In addition to the sequences described in the present invention, DNA coding sequences
useful in the present invention can be derived from algae, fungi, bacteria, mammalian
sources, plants, etc. Homology searches in existing databases using signature sequences
corresponding to conserved nucleotide and amino acid sequences of prenyltransferase can be
employed to isolate equivalent, related genes from other sources such as plants and
microorganisms. Searches in EST databases can also be employed. Furthermore, the use of
DNA sequences encoding enzymes functionally enzymatically equivalent to those disclosed
herein, wherein such DNA sequences are degenerate equivalents of the nucleic acid
sequences disclosed herein in accordance with the degeneracy of the genetic code, is also
encompassed by the present invention. Demonstration of the functionality of coding
sequences identified by any of these methods can be carried out by complementation of
mutants of appropriate organisms, such as Synechocystis, Shewanella, yeast, Pseudomonas,
Rhodobacteria, etc., that lack specific biochemical reactions, or that have been mutated. The
sequences of the DNA coding regions can be optimized by gene resynthesis, based on codon
usage, for maximum expression in particular hosts.

For the alteration of tocopherol production in a host cell, a second expression
construct can be used in accordance with the present invention. For example, the
prenyltransferase expression construct can be introduced into a host cell in conjunction with
a second expression construct having a nucleotide sequence for a protein involved in
tocopherol biosynthesis.

The method of transformation in obtaining such transgenic plants is not critical to the
instant invention, and various methods of plant transformation are currently available.
Furthermore, as newer methods become available to transform crops, they may also be
directly applied hereunder. For example, many plant species naturally susceptible to
Agrobacterium infection may be successfully transformed via tripartite or binary vector
methods of Agrobacterium mediated transformation. In many instances, it will be desirable
to have the construct bordered on one or both sides by T-DNA, particularly having the left
and right borders, more particularly the right border. This is particularly useful when the
construct uses A. tumefaciens or A. rhizogenes as a mode for transformation, although the
T-DNA borders may find use with other modes of transformation. In addition, techniques of
microinjection, DNA particle bombardment, and electroporation have been developed which
allow for the transformation of various monocot and dicot plant species.

Normally, included with the DNA construct will be a structural gene having the
necessary regulatory regions for expression in a host and providing for selection of
transformant cells. The gene may provide for resistance to a cytotoxic agent, e.g. antibiotic,
heavy metal, toxin, etc., complementation providing prototrophy to an auxotrophic host, viral
immunity or the like. Depending upon the number of different host species into which the
expression construct or components thereof are introduced, one or more markers may be
employed, where different conditions for selection are used for the different hosts.

Where Agrobacterium is used for plant cell transformation, a vector may be used
which may be introduced into the Agrobacterium host for homologous recombination with
T-DNA or the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid
containing the T-DNA for recombination may be armed (capable of causing gall formation)
or disarmed (incapable of causing gall formation), the latter being permissible, so long as the
vir genes are present in the transformed Agrobacterium host. The armed plasmid can give a
mixture of normal plant cells and gall.

In some instances where Agrobacterium is used as the vehicle for transforming host
plant cells, the expression or transcription construct bordered by the T-DNA border region(s)
will be inserted into a broad host range vector capable of replication in E. coli and
Agrobacterium, there being broad host range vectors described in the literature. Commonly
used is pRK2 or derivatives thereof. See, for example, Ditta, et al. (Proc. Nat. Acad. Sci.,
U.S.A. (1980) 77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference.
Alternatively, one may insert the sequences to be expressed in plant cells into a vector
containing separate replication sequences, one of which stabilizes the vector in E. coli, and
the other in Agrobacterium. See, for example, McBride, et al. (Plant Mol. Biol. (1990)
14:269-276), wherein the pRiHRI (Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374)
origin of replication is utilized and provides for added stability of the plant expression vectors
in host Agrobacterium cells.

Included with the expression construct and the T-DNA will be one or more markers,
which allow for selection of transformed Agrobacterium and transformed plant cells. A
number of markers have been developed for use with plant cells, such as resistance to
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The
particular marker employed is not essential to this invention, one or another marker being
preferred depending on the particular host and the manner of construction.

For transformation of plant cells using Agrobacterium, explants may be combined and
incubated with the transformed Agrobacterium for sufficient time for transformation, the
bacteria killed, and the plant cells cultured in an appropriate selective medium. Once callus
forms, shoot formation can be encouraged by employing the appropriate plant hormones in
accordance with known methods and the shoots transferred to rooting medium for
regeneration of plants. The plants may then be grown to seed and the seed used to establish
repetitive generations and for isolation of vegetable oils.

There are several possible ways to obtain the plant cells of this invention which
contain multiple expression constructs. Any means for producing a plant comprising a
construct having a DNA sequence encoding the expression construct of the present invention,
and at least one other construct having another DNA sequence encoding an enzyme, is
encompassed by the present invention. For example, the expression construct can be used to
transform a plant at the same time as the second construct either by inclusion of both
expression constructs in a single transformation vector or by using separate vectors, each of
which expresses desired genes. The second construct can be introduced into a plant which has
already been transformed with the prenyltransferase expression construct, or alternatively,
transformed plants, one expressing the prenyltransferase construct and one expressing the
second construct, can be crossed to bring the constructs together in the same plant.

The nucleic acid sequences of the present invention can be used in constructs to
provide for the expression of the sequence in a variety of host cells, both prokaryotic
and eukaryotic. Host cells of the present invention preferably include monocotyledonous and
dicotyledonous plant cells.

In general, the skilled artisan is familiar with the standard resource materials which
describe specific conditions and procedures for the construction, manipulation and isolation
of macromolecules (e.g., DNA molecules, plasmids, etc.), generation of recombinant
organisms and the screening and isolating of clones (see, for example, Sambrook et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (1989); Maliga et al.,
Methods in Plant Molecular Biology, Cold Spring Harbor Press (1995), the entirety of which
is herein incorporated by reference; Birren et al., Genome Analysis: Analyzing DNA, 1, Cold
Spring Harbor, New York, the entirety of which is herein incorporated by reference).

Methods for the expression of sequences in insect host cells are known in the art.
Baculovirus expression vectors are recombinant insect viruses in which the coding sequence
for a chosen foreign gene has been inserted behind a baculovirus promoter in place of the
viral gene, e.g., polyhedrin (Smith and Summers, U.S. Pat. No. 4,745,051, the entirety of
which is incorporated herein by reference). Baculovirus expression vectors are known in the
art, and are described, for example, in Doerfler, Curr. Top. Microbiol. Immunol. 131:51-68
(1968); Luckow and Summers, Bio/Technology 6:47-55 (1988a); Miller, Annual Review of
Microbiol. 42:177-199 (1988); Summers, Curr. Comm. Molecular Biology, Cold Spring
Harbor Press, Cold Spring Harbor, N.Y. (1988); Summers and Smith, A Manual of Methods
for Baculovirus Vectors and Insect Cell Culture Procedures, Texas Ag. Exper. Station
Bulletin No. 1555 (1988), the entireties of which are herein incorporated by reference.

Methods for the expression of a nucleic acid sequence of interest in a fungal host cell
are known in the art. The fungal host cell may, for example, be a yeast cell or a filamentous
fungal cell. Methods for the expression of DNA sequences of interest in yeast cells are
generally described in "Guide to yeast genetics and molecular biology", Guthrie and Fink,
eds., Methods in Enzymology, Academic Press, Inc., Vol 194 (1991) and "Gene expression
technology", Goeddel ed., Methods in Enzymology, Academic Press, Inc., Vol 185 (1991).

Mammalian cell lines available as hosts for expression are known in the art and
include many immortalized cell lines available from the American Type Culture Collection
(ATCC, Manassas, VA), such as HeLa cells, Chinese hamster ovary (CHO) cells, baby
hamster kidney (BHK) cells and a number of other cell lines. Suitable promoters for
mammalian cells are also known in the art and include, but are not limited to, viral promoters
such as that from Simian Virus 40 (SV40) (Fiers et al., Nature 273:113 (1978), the entirety of
which is herein incorporated by reference), Rous sarcoma virus (RSV), adenovirus (ADV)
and bovine papilloma virus (BPV). Mammalian cells may also require terminator sequences
and poly-A addition sequences. Enhancer sequences which increase expression may also be
included and sequences which promote amplification of the gene may also be desirable (for
example methotrexate resistance genes).

Vectors suitable for replication in mammalian cells are well known in the art, and may
include viral replicons, or sequences which ensure integration of the appropriate sequences
encoding epitopes into the host genome. Plasmid vectors that greatly facilitate the
construction of recombinant viruses have been described (see, for example, Mackett et al., J.
Virol. 49:857 (1984); Chakrabarti et al., Mol. Cell. Biol. 5:3403 (1985); Moss, In: Gene
Transfer Vectors For Mammalian Cells (Miller and Calos, eds., Cold Spring Harbor
Laboratory, N.Y., p. 10, (1987)); all of which are herein incorporated by reference in their
entirety).

The invention now being generally described, it will be more readily understood by
reference to the following examples which are included for purposes of illustration only and
are not intended to limit the present invention.

EXAMPLES 

Example 1: Identification of Prenyltransferase Sequences 

PSI-BLAST (Altschul, et al. (1997) Nuc. Acid Res. 25:3389-3402) profiles were
generated for both the straight chain and aromatic classes of prenyltransferases. To generate
the straight chain profile, a prenyltransferase from Porphyra purpurea (GenBank accession
1709766) was used as a query against the NCBI non-redundant protein database. The E. coli
enzyme involved in the formation of ubiquinone, ubiA (GenBank accession 1790473), was
used as a starting sequence to generate the aromatic prenyltransferase profile. These profiles
were used to search public and proprietary DNA and protein databases. In Arabidopsis, seven
putative prenyltransferases of the straight-chain class were identified, ATPT1 (SEQ ID
NO:9), ATPT7 (SEQ ID NO:10), ATPT8 (SEQ ID NO:11), ATPT9 (SEQ ID NO:13),
ATPT10 (SEQ ID NO:14), ATPT11 (SEQ ID NO:15), and ATPT12 (SEQ ID NO:16), and
five were identified of the aromatic class, ATPT2 (SEQ ID NO:1), ATPT3 (SEQ ID NO:3),
ATPT4 (SEQ ID NO:5), ATPT5 (SEQ ID NO:7), and ATPT6 (SEQ ID NO:8). Additional
prenyltransferase sequences from other plants related to the aromatic class of
prenyltransferases, such as soy (SEQ ID NOs:19-23, the deduced amino acid sequence of
SEQ ID NO:23 is provided in SEQ ID NO:24) and maize (SEQ ID NOs:25-29, and 31) are
also identified. The deduced amino acid sequence of ZMPT5 (SEQ ID NO:29) is provided in
SEQ ID NO:30.

Searches are performed on a Silicon Graphics Unix computer using additional
Bioaccelerator hardware and GenWeb software supplied by Compugen Ltd. This software
and hardware enables the use of the Smith-Waterman algorithm in searching DNA and
protein databases using profiles as queries. The program used to query protein databases is
profilesearch. This is a search where the query is not a single sequence but a profile based on
a multiple alignment of amino acid or nucleic acid sequences. The profile is used to query a
sequence data set, i.e., a sequence database. The profile contains all the pertinent information
for scoring each position in a sequence, in effect replacing the "scoring matrix" used for the
standard query searches. The program used to query nucleotide databases with a protein
profile is tprofilesearch. Tprofilesearch searches nucleic acid databases using an amino acid
profile query. As the search is running, sequences in the database are translated to amino acid
sequences in six reading frames. The output file for tprofilesearch is identical to the output
file for profilesearch except for an additional column that indicates the frame in which the
best alignment occurred.
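By way of illustration only, the six-frame translation of a database sequence described above can be sketched as follows. This is not the Compugen tprofilesearch program; it is a minimal sketch that assumes the standard genetic code, and the function names and the short test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; unknown codons give 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Translate three forward frames (+1..+3) and three reverse frames (-1..-3)."""
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


if __name__ == "__main__":
    # Hypothetical nine-base sequence used only to exercise the function.
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each translated frame could then be scored against the amino acid profile, with the frame label recorded alongside the best alignment, which corresponds to the additional output column mentioned above.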

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used to
search for similarities between one sequence from the query and a group of sequences
contained in the database. E score values as well as other sequence information, such as
conserved peptide sequences, are used to identify related sequences.
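For illustration, the local alignment recurrence underlying the Smith-Waterman algorithm can be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the searches described above: it compares two plain sequences rather than a profile against a database, it uses a fixed match/mismatch score instead of a substitution matrix, and it applies a linear gap penalty. The function name and the scoring values are hypothetical.

```python
def smith_waterman_score(query: str, subject: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between two sequences.

    Assumed scoring: +match for identical residues, +mismatch otherwise,
    and a linear penalty of `gap` per gapped position.
    """
    rows, cols = len(query) + 1, len(subject) + 1
    # score[i][j] is the best score of a local alignment ending at
    # query[i-1] and subject[j-1]; it is never allowed to fall below zero.
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if query[i - 1] == subject[j - 1] else mismatch
            score[i][j] = max(
                0,                            # start a new local alignment
                score[i - 1][j - 1] + pair,   # align the two residues
                score[i - 1][j] + gap,        # gap in the subject
                score[i][j - 1] + gap,        # gap in the query
            )
            best = max(best, score[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTXX", "ACGT"))
```

In a profile search, the fixed match/mismatch term would be replaced by the position-specific score that the profile assigns to each residue at each alignment position.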

To obtain the entire coding region corresponding to the Arabidopsis prenyltransferase
sequences, synthetic oligonucleotide primers are designed to amplify the 5' and 3' ends of
partial cDNA clones containing prenyltransferase sequences. Primers are designed according
to the respective Arabidopsis prenyltransferase sequences and used in Rapid Amplification of
cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA
85:8998-9002) using the Marathon cDNA amplification kit (Clontech Laboratories Inc, Palo
Alto, CA).

Additional BLAST searches are performed using the ATPT2 sequence, a sequence in
the class of aromatic prenyltransferases. Additional sequences are identified in soybean
libraries that are similar to the ATPT2 sequence. The additional soybean sequence
demonstrates 80% identity and 91% similarity at the amino acid sequence level.

Amino acid sequence alignments between ATPT2 (SEQ ID NO:2), ATPT3 (SEQ ID
NO:4), ATPT4 (SEQ ID NO:6), ATPT8 (SEQ ID NO:12), and ATPT12 (SEQ ID NO:17) are
performed using ClustalW (Figure 1), and the percent identity and similarities are provided in
Table 1 below.

Table 1:

                     ATPT2   ATPT3   ATPT4   ATPT8   ATPT12
ATPT2   % Identity           12      13      11      15
        % Similar            25      25      22      32
        % Gap                17      20      20      9
ATPT3   % Identity                   12      6       22
        % Similar                    29      16      38
        % Gap                        20      24      14
ATPT4   % Identity                           9       14
        % Similar                            18      29
        % Gap                                26      19
ATPT8   % Identity                                   7
        % Similar                                    19
        % Gap                                        20
ATPT12  % Identity
        % Similar
        % Gap

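As an illustration of how pairwise statistics of the kind reported in Table 1 can be derived from two rows of a multiple alignment, a minimal sketch follows. The definitions used here are assumptions, since the exact calculation behind Table 1 is not stated: each percentage is taken over the full length of the pairwise alignment, a column containing a gap character in either sequence counts as a gap, and "similar" residues are defined by a small, hypothetical set of conservative residue groups. The example sequences are invented.

```python
# Hypothetical conservative residue groups used to call two residues "similar".
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("ST"), set("NQ"), set("AG"),
]


def alignment_stats(seq_a: str, seq_b: str) -> dict:
    """Return percent identity, similarity and gap for two aligned sequences.

    Both inputs must be rows of the same alignment (equal length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = similar = gapped = 0
    for res_a, res_b in zip(seq_a.upper(), seq_b.upper()):
        if res_a == "-" or res_b == "-":
            gapped += 1
        elif res_a == res_b:
            identical += 1
            similar += 1          # identical residues also count as similar
        elif any(res_a in group and res_b in group for group in SIMILAR_GROUPS):
            similar += 1
    length = len(seq_a)
    return {
        "identity": 100.0 * identical / length,
        "similarity": 100.0 * similar / length,
        "gap": 100.0 * gapped / length,
    }


if __name__ == "__main__":
    # Invented six-column alignment used only to exercise the function.
    print(alignment_stats("MKV-LT", "MRVALS"))
```

Alignment programs differ in how they define similarity and in which denominator they use, so values computed this way would not necessarily reproduce those in Table 1.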
Example 2: Preparation of Expression Constructs 

A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to
allow the cloning of multiple napin fusion genes into plant binary transformation vectors. An
adapter comprised of the self annealed oligonucleotide of sequence
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:40) was ligated into the cloning vector pBC SK+ (Stratagene) after
digestion with the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids
pCGN3223 and pCGN7765 were digested with NotI and ligated together. The resultant
vector, pCGN7770, contains the pCGN7765 backbone with the napin seed specific
expression cassette from pCGN3223.
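The self-annealing property of the SEQ ID NO:40 adapter can be checked with a short sketch such as the following. The check assumes that the first four bases (CGCG) form a single-stranded 5' overhang, which is the overhang left by BssHII digestion, and that the remaining bases pair with a second copy of the same oligonucleotide. The function names are hypothetical, and the recognition sequences listed for SwaI, AscI and NotI are general knowledge rather than annotations taken from this description.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Adapter oligonucleotide of SEQ ID NO:40, as given in the text.
ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"

# Recognition sequences from general knowledge (not stated in the text).
SITES = {"SwaI": "ATTTAAAT", "AscI": "GGCGCGCC", "NotI": "GCGGCCGC"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def self_annealing_overhang(oligo: str, overhang_len: int = 4):
    """Return the 5' overhang if the rest of the oligo is self-complementary.

    Returns None when the portion after the overhang is not its own
    reverse complement, i.e. when two copies could not form a full duplex.
    """
    overhang, core = oligo[:overhang_len], oligo[overhang_len:]
    return overhang if core == reverse_complement(core) else None


if __name__ == "__main__":
    print("5' overhang:", self_annealing_overhang(ADAPTER))
    for enzyme, site in SITES.items():
        print(enzyme, "site present:", site in ADAPTER)
```

Under these assumptions, two copies of the adapter anneal into a duplex with a CGCG overhang at each end, consistent with ligation into a BssHII-digested vector.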

The cloning cassette pCGN7787 contains essentially the same regulatory elements as
pCGN7770, with the exception that the napin regulatory regions of pCGN7770 have been
replaced with the double CaMV 35S promoter and the tml polyadenylation and
transcriptional termination region.

A binary vector for plant transformation, pCGN5139, was constructed from
pCGN1558 (McBride and Summerfelt, (1990) Plant Molecular Biology, 14:269-276). The
polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker
containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI.
The Asp718 and HindIII restriction endonuclease sites are retained in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA
sequences into binary vectors containing transcriptional initiation regions (promoters) and
transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:41) and
5'-TCGACGTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:42) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was excised from pCGN8618 by digestion with Asp718I; the fragment was
blunt-ended by filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8622.

The plasmid pCGN8619 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:43) and
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:44) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was removed from pCGN8619 by digestion with Asp718I; the fragment was
blunt-ended by filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:45) and
5'-CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:46) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region was
removed from pCGN8620 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and
blunt-ended by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:47) and 5'-
GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:48) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region was
removed from pCGN8621 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended
by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8625.

The plasmid construct pCGN8640 is a modification of pCGN8624 described above.
A 938 bp PstI fragment isolated from transposon Tn7 which encodes bacterial spectinomycin
and streptomycin resistance (Fling et al. (1985), Nucleic Acids Research 13(19):7095-7106), a
determinant for E. coli and Agrobacterium selection, was blunt ended with Pfu polymerase.
The blunt ended fragment was ligated into pCGN8624 that had been digested with SpeI and
blunt ended with Pfu polymerase. The region containing the PstI fragment was sequenced to
confirm both the insert orientation and the integrity of cloning junctions.

The spectinomycin resistance marker was introduced into pCGN8622 and pCGN8623
as follows. A 7.7 Kbp AvrII-SnaBI fragment from pCGN8640 was ligated to a 10.9 Kbp
AvrII-SnaBI fragment from pCGN8623 or pCGN8622, described above. The resulting
plasmids were pCGN8641 and pCGN8643, respectively.

The plasmid pCGN8644 was constructed by ligating oligonucleotides 5'-
GATCACCTGCAGGAAGCTTGCGGCCGCGGATCCAATGCA-3' (SEQ ID NO:49) and
5'-TTGGATCCGCGGCCGCAAGCTTCCTGCAGGT-3' (SEQ ID NO:50) into BamHI-PstI
digested pCGN8640.

Synthetic oligonucleotides were designed for use in Polymerase Chain Reactions (PCR)
to amplify the coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 for the
preparation of expression constructs and are provided in Table 2 below.

Table 2:

Name     Restriction Site   Sequence                                             SEQ ID NO:
ATPT2    5' NotI            GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT            51
ATPT2    3' SseI            GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT               52
ATPT3    5' NotI            GGATCCGCGGCCGCACAATGGCGTTrrrrGGGCTCTCCCGTGTTT        53
ATPT3    3' SseI            GGATCCTGCAGGTTATTGAAAACTTCTTCCAAGTACAACT             54
ATPT4    5' NotI            GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT               55
ATPT4    3' SseI            GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT                56
ATPT8    5' NotI            GGATCCGCGGCCGCACAATGGTACTTGCCGAGGTTCCAAAGCTTGCCTCT   57
ATPT8    3' SseI            GGATCCTGCAGGTCACTTGTTTCTGGTGATGACTCTAT               58
ATPT12   5' NotI            GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT               59
ATPT12   3' SseI            GGATCCTGCAGGTCAGTGTTGCGATGCTAATGCCGT                 60


The coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 were all
amplified using the respective PCR primers shown in Table 2 above and cloned into the TopoTA
vector (Invitrogen). Constructs containing the respective prenyltransferase sequences were
digested with NotI and Sse8387I and cloned into the turbobinary vectors described above.
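
The NotI/Sse8387I cloning step relies on each primer carrying the recognition site named in its Table 2 label. The following is a minimal sketch of that check, not part of the described method. The recognition sequences used (GCGGCCGC for NotI, CCTGCAGG for Sse8387I) are the standard ones for these enzymes rather than values stated in the text, and the function names are illustrative. Only two primer pairs are included for brevity.

```python
# Recognition sequences (both palindromic); standard values, not taken from the text.
SITES = {"NotI": "GCGGCCGC", "Sse8387I": "CCTGCAGG"}

# Two primer pairs copied from Table 2 (SEQ ID NO:51, 52, 55, 56).
PRIMERS = {
    ("ATPT2", "5' NotI"): "GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT",
    ("ATPT2", "3' SseI"): "GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT",
    ("ATPT4", "5' NotI"): "GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT",
    ("ATPT4", "3' SseI"): "GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT",
}


def sites_in(seq):
    """Return the sorted names of the recognition sites present in seq."""
    return sorted(name for name, site in SITES.items() if site in seq)


def check_primers(primers):
    """Map each primer key to True if it carries the site implied by its label."""
    result = {}
    for (gene, label), seq in primers.items():
        expected = "NotI" if "NotI" in label else "Sse8387I"
        result[(gene, label)] = expected in sites_in(seq)
    return result


if __name__ == "__main__":
    for key, ok in check_primers(PRIMERS).items():
        print(key, "site present" if ok else "site missing")
```

Because both sites are palindromic, a plain substring search on the primer as written is sufficient; no reverse-complement search is needed.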

The sequence encoding ATPT2 prenyltransferase was cloned in the sense orientation into
pCGN8640 to produce the plant transformation construct pCGN10800 (Figure 2). The ATPT2
sequence is under control of the 35S promoter.

The ATPT2 sequence was also cloned in the antisense orientation into the construct
pCGN8641 to create pCGN10801 (Figure 3). This construct provides for the antisense
expression of the ATPT2 sequence from the napin promoter.

The ATPT2 coding sequence was also cloned in the antisense orientation into the vector
pCGN8643 to create the plant transformation construct pCGN10802.

The ATPT2 coding sequence was also cloned in the antisense orientation into the vector
pCGN8644 to create the plant transformation construct pCGN10803 (Figure 4).

The ATPT4 coding sequence was cloned into the vector pCGN864 to create the plant
transformation construct pCGN10806 (Figure 5). The ATPT2 coding sequence was cloned into
the vector pCGN864 to create the plant transformation construct pCGN10807 (Figure 6). The
ATPT3 coding sequence was cloned into the vector pCGN864 to create the plant transformation
construct pCGN10808 (Figure 7). The ATPT3 coding sequence was cloned in the sense
orientation into the vector pCGN8640 to create the plant transformation construct pCGN10809
(Figure 8). The ATPT3 coding sequence was cloned in the antisense orientation into the vector
pCGN8641 to create the plant transformation construct pCGN10810 (Figure 9). The ATPT3
coding sequence was cloned into the vector pCGN8643 to create the plant transformation
construct pCGN10811 (Figure 10). The ATPT3 coding sequence was cloned into the vector
pCGN8640 to create the plant transformation construct pCGN10812 (Figure 11). The ATPT4
coding sequence was cloned into the vector pCGN8640 to create the plant transformation
construct pCGN10813 (Figure 12). The ATPT4 coding sequence was cloned into the vector
pCGN8643 to create the plant transformation construct pCGN10814 (Figure 13). The ATPT4
coding sequence was cloned into the vector pCGN8641 to create the plant transformation
construct pCGN10815 (Figure 14). The ATPT4 coding sequence was cloned in the antisense
orientation into the vector pCGN8644 to create the plant transformation construct pCGN10816
(Figure 15). The ATPT2 coding sequence was cloned into the vector pCGN???? to create the
plant transformation construct pCGN10817 (Figure 16). The ATPT8 coding sequence was cloned
in the sense orientation into the vector pCGN8643 to create the plant transformation construct
pCGN10819 (Figure 17). The ATPT12 coding sequence was cloned into the vector pCGN8644
to create the plant transformation construct pCGN10824 (Figure 18). The ATPT12 coding
sequence was cloned into the vector pCGN8641 to create the plant transformation construct
pCGN10825 (Figure 19). The ATPT8 coding sequence was cloned into the vector pCGN8644 to
create the plant transformation construct pCGN10826 (Figure 20).

Example 3: Plant Transformation

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation
as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports
(1992) 11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by
Agrobacterium-mediated transformation as described by Valverkens et al. (Proc. Nat. Acad.
Sci. (1988) 85:5536-5540), or as described by Bent et al. ((1994), Science 265:1856-1860), or
Bechtold et al. ((1993), C.R. Acad. Sci., Life Sciences 316:1194-1199). Other plant species may
be similarly transformed using related techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et
al. (Bio/Technology 10:286-291) may also be used to obtain nuclear transformed plants.

Example 4: Identification of Additional Prenyltransferases 

A PSI-Blast profile generated using the E. coli ubiA (GenBank accession 1790473)
sequence was used to analyze the Synechocystis genome. This analysis identified 5 open
reading frames (ORFs) in the Synechocystis genome that were potentially prenyltransferases:
slr0926 (annotated as ubiA (4-hydroxybenzoate-octaprenyl transferase), SEQ ID NO:32),
sll1899 (annotated as ctaB (cytochrome c oxidase folding protein), SEQ ID NO:33), slr0056
(annotated as g4 (chlorophyll synthase 33 kd subunit), SEQ ID NO:34), slr1518 (annotated as
menA (menaquinone biosynthesis protein), SEQ ID NO:35), and slr1736 (annotated as a
hypothetical protein of unknown function, SEQ ID NO:36).

To determine the functionality of these ORFs and their involvement, if any, in the
biosynthesis of tocopherols, knockout constructs were made to disrupt the ORFs identified in
Synechocystis.

Synthetic oligos were designed to amplify regions from the 5' (5'-
TAATGTGTACATTGTCGGCCTC (17365') (SEQ ID NO:61) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCCACAATTCCCCGCACCGTC
(1736kanpr1)) (SEQ ID NO:62) and 3' (5'-AGGCTAATAAGCACAAATGGGA
(17363') (SEQ ID NO:63) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGAATTGGTTTAGGTTATCCC
(1736kanpr2)) (SEQ ID NO:64) ends of the slr1736 ORF.
The 1736kanpr1 and 1736kanpr2 oligos contained 20 bp of homology to the slr1736 ORF
with an additional 40 bp of sequence homology to the ends of the kanamycin resistance
cassette. Separate PCR steps were completed with these oligos and the products were gel
purified and combined with the kanamycin resistance gene from puc4K (Pharmacia) that had
been digested with HincII and gel purified away from the vector backbone. The combined
fragments were allowed to assemble without oligos under the following conditions: 94°C for
1 min, 55°C for 1 min, 72°C for 1 min plus 5 seconds per cycle for 40 cycles using Pfu
polymerase in a 100 ul reaction volume (Zhao, H. and Arnold (1997) Nucleic Acids Res.
25(6):1307-1308). One microliter or five microliters of this assembly reaction was then
amplified using 5' and 3' oligos nested within the ends of the ORF fragment, so that the
resulting product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked
out, the kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be
knocked out. This PCR product was then cloned into the vector pGemT easy (Promega) to
create the construct pMON21681 and used for Synechocystis transformation.
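
The two-part design of the fusion primers (a cassette-homology tail followed by an ORF-specific region) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the 40-nt tail length is taken from the description above and assumed to sit at the 5' end of the primer as written, and the function name is hypothetical. The second primer (SEQ ID NO:78) is the corresponding slr0056 oligo listed later in this example.

```python
# Fusion primers copied from the text (SEQ ID NO:62 and SEQ ID NO:78).
KANPR1_SLR1736 = "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCCACAATTCCCCGCACCGTC"
KANPR1_SLR0056 = "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCGCCAATACCAGCCACCAACAG"

KAN_TAIL_LEN = 40  # cassette-homology length stated in the text (assumed 5'-terminal)


def split_fusion_primer(primer, tail_len=KAN_TAIL_LEN):
    """Split a fusion primer into (cassette-homology tail, ORF-specific part)."""
    return primer[:tail_len], primer[tail_len:]


if __name__ == "__main__":
    tail, orf_part = split_fusion_primer(KANPR1_SLR1736)
    print("cassette tail:", tail, len(tail))
    print("ORF-specific :", orf_part, len(orf_part))
    # The same tail should recur in the fusion primers for the other ORFs.
    print("tail shared with slr0056 primer:",
          tail == split_fusion_primer(KANPR1_SLR0056)[0])
```

A shared tail across ORFs is what allows the same kanamycin cassette fragment to be assembled into each knockout construct.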

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The ubiA 5' sequence was amplified using the primers 5'-
GGATCCATGGTTGCCCAAACCCCATC (SEQ ID NO:65) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGGGTAAGCAACAATGACCGGC
(SEQ ID NO:66). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTCAAAGCCAGCCCAGTAAC (SEQ ID NO:67) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGTGCGAAAAGGGTTTTCCC
(SEQ ID NO:68). The amplification products were combined with the kanamycin resistance
gene from puc4K (Pharmacia) that had been digested with HincII and gel purified away from
the vector backbone. The annealed fragment was amplified using 5' and 3' oligos nested
within the ends of the ORF fragment (5'-CCAGTGGTTTAGGCTGTGTGGTC (SEQ ID
NO:69) and 5'-CTGAGTTGGATGTATTGGATC (SEQ ID NO:70)), so that the resulting
product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out.
This PCR product was then cloned into the vector pGemT easy (Promega) to create the
construct pMON21682 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The sll1899 5' sequence was amplified using the primers 5'-
GGATCCATGGTTACTTCGACAAAAATCC (SEQ ID NO:71) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGCTAGGCAACCGCTTAGTAC
(SEQ ID NO:72). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAACCCAACAGTAAAGTTCCC (SEQ ID NO:73) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGCCGGCATTGTCTTTTACATG
(SEQ ID NO:74). The amplification products were combined with the kanamycin resistance
gene from puc4K (Pharmacia) that had been digested with HincII and gel purified away from
the vector backbone. The annealed fragment was amplified using 5' and 3' oligos nested
within the ends of the ORF fragment (5'-GGAACCCTTGCAGCCGCTTC (SEQ ID NO:75)
and 5'-GTATGCCCAACTGGTGCAGAGG (SEQ ID NO:76)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out.
This PCR product was then cloned into the vector pGemT easy (Promega) to create the
construct pMON21679 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The slr0056 5' sequence was amplified using the primers 5'-
GGATCCATGTCTGACACACAAAATACCG (SEQ ID NO:77) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCGCCAATACCAGCCACCAACAG
(SEQ ID NO:78). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTCAAATCCCCGCATGGCCTAG (SEQ ID NO:79) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGCCTACGGCTTGGACGTGTGGG
(SEQ ID NO:80). The amplification products were combined with the
kanamycin resistance gene from puc4K (Pharmacia) that had been digested with HincII and
gel purified away from the vector backbone. The annealed fragment was amplified using 5'
and 3' oligos nested within the ends of the ORF fragment (5'-
CACTTGGATTCCCCTGATCTG (SEQ ID NO:81) and 5'-
GCAATACCCGCTTGGAAAACG (SEQ ID NO:82)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out.
This PCR product was then cloned into the vector pGemT easy (Promega) to create the
construct pMON21677 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The slr1518 5' sequence was amplified using the primers 5'-
GGATCCATGACCGAATCTTCGCCCCTAGC (SEQ ID NO:83) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCAATCCTAGGTAGCCGAGGCG
(SEQ ID NO:84). The 3' region was amplified using the synthetic oligonucleotide primers
5'-GAATTCTTAGCCCAGGCCAGCCCAGCC (SEQ ID NO:85) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGGAATTGATTTGTTTAATTACC
(SEQ ID NO:86). The amplification products were combined with the kanamycin resistance
gene from puc4K (Pharmacia) that had been digested with HincII and gel purified away from
the vector backbone. The annealed fragment was amplified using 5' and 3' oligos nested
within the ends of the ORF fragment (5'-GCGATCGCCATTATCGCTTGG (SEQ ID NO:87) and 5'-
GCAGACTGGCAATTATCAGTAACG (SEQ ID NO:88)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out.
This PCR product was then cloned into the vector pGemT easy (Promega) to create the
construct pMON21680 and used for Synechocystis transformation.

B. Transformation of Synechocystis

Cells of Synechocystis 6803 were grown to a density of approximately 2x10^ cells per
ml and harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11 medium
(ATCC Medium 616) at a density of 1x10^ cells per ml and used immediately for
transformation. One-hundred microliters of these cells were mixed with 5 ul of mini prep
DNA and incubated with light at 30°C for 4 hours. This mixture was then plated onto nylon
filters resting on BG-11 agar supplemented with TES pH 8 and allowed to grow for 12-18
hours. The filters were then transferred to BG-11 agar + TES + 5 ug/ml kanamycin and
allowed to grow until colonies appeared within 7-10 days (Packer and Glazer, 1988).
Colonies were then picked into BG-11 liquid media containing 5 ug/ml kanamycin and
allowed to grow for 5 days. These cells were then transferred to BG-11 media containing
10 ug/ml kanamycin and allowed to grow for 5 days and then transferred to BG-11 +
kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells were then harvested for PCR
analysis to determine the presence of a disrupted ORF and also for HPLC analysis to
determine if the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates for slr1736 and sll1899 showed complete
segregation of the mutant genome, meaning no copies of the wild type genome could be
detected in these strains. This suggests that function of the native gene is not essential for cell
function. HPLC analysis of these same isolates showed that the sll1899 strain had no
detectable reduction in tocopherol levels. However, the strain carrying the knockout for
slr1736 produced no detectable levels of tocopherol.

The amino acid sequences for the Synechocystis knockouts are compared using
ClustalW, and are provided in Table 3 below. Provided are the percent identities, percent
similarity, and the percent gap. The alignment of the sequences is provided in Figure 21.



Table 3: 





slr1736


slr0926


sll1899


slr0056


slr1518


slr1736 %identity
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12 


18 


11 


%similar 
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34 


26 


%gap 
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5 


slr0926 %identity 
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%similar 
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28 


%gap 
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sill 899 %identity 








17 
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%similar 
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%gap 
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slr0056 %identity 










15 


%similar 
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slrl518 %identity 




%similar 




%gap 





Amino acid sequence comparisons are performed using various Arabidopsis
prenyltransferase sequences and the Synechocystis sequences. The comparisons are presented
in Table 4 below. Provided are the percent identities, percent similarity, and the percent gap.
The alignment of the sequences is provided in Figure 22.

4B. Preparation of the slr1737 Knockout

The Synechocystis sp. 6803 slr1737 knockout was constructed by the following
method. The GPS™-1 Genome Priming System (New England Biolabs) was used to insert,
by a Tn7 Transposase system, a kanamycin resistance cassette into slr1737. A plasmid
from a Synechocystis genomic library clone containing 652 base pairs of the targeted orf
(Synechocystis genome base pairs 1324051 - 1324703; the predicted orf base pairs 1323672
- 1324763, as annotated by Cyanobase) was used as target DNA. The reaction was
performed according to the manufacturer's protocol. The reaction mixture was then
transformed into E. coli DH10B electrocompetent cells and plated. Colonies from this
transformation were then screened for transposon insertions into the target sequence by
amplifying with M13 Forward and Reverse Universal primers, yielding a product of 652 base
pairs plus ~1700 base pairs, the size of the transposon kanamycin cassette, for a total
fragment size of ~2300 base pairs. After this determination, it was then necessary to
determine the approximate location of the insertion within the targeted orf, as 100 base pairs
of orf sequence was estimated as necessary for efficient homologous recombination in
Synechocystis. This was accomplished through amplification reactions using either of the
primers to the ends of the transposon, Primer S (5' end) or N (3' end), in combination with
either a M13 Forward or Reverse primer. That is, four different primer combinations were
used to map each potential knockout construct: Primer S - M13 Forward, Primer S - M13
Reverse, Primer N - M13 Forward, Primer N - M13 Reverse. The construct used to
transform Synechocystis and knock out slr1737 was determined to consist of approximately
150 base pairs of slr1737 sequence on the 5' side of the transposon insertion and
approximately 500 base pairs on the 3' side, with the transcription of the orf and kanamycin
cassette in the same direction. The nucleic acid sequence of slr1737 is provided in SEQ ID
NO:38; the deduced amino acid sequence is provided in SEQ ID NO:39.
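
The screening arithmetic and the four mapping reactions described above can be sketched as follows. This is an illustrative calculation only. It assumes the cloned orf fragment length is the difference of the quoted genome coordinates, and it ignores any vector sequence between the M13 primer sites and the insert. The constant and function names are hypothetical.

```python
from itertools import product

# Length of the cloned orf fragment, from the genome coordinates quoted in the text.
ORF_FRAGMENT_BP = 1324703 - 1324051
# Approximate size of the transposon kanamycin cassette stated in the text.
TRANSPOSON_BP = 1700


def expected_screen_product(orf_bp=ORF_FRAGMENT_BP, transposon_bp=TRANSPOSON_BP):
    """Approximate M13 Forward/Reverse product size for a clone with one insertion."""
    return orf_bp + transposon_bp


def mapping_primer_pairs():
    """All transposon-end x vector-primer combinations used to map an insertion."""
    return list(product(["Primer S", "Primer N"], ["M13 Forward", "M13 Reverse"]))


if __name__ == "__main__":
    print("orf fragment (bp):", ORF_FRAGMENT_BP)
    print("expected product with insertion (bp):", expected_screen_product())
    for pair in mapping_primer_pairs():
        print(" - ".join(pair))
```

The computed 2352 bp is consistent with the roughly 2300 bp total quoted in the text, given that the cassette size is itself approximate.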
Cells of Synechocystis 6803 were grown to a density of ~2x10^ cells per ml and
harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11 medium at a
density of 1x10^ cells per ml and used immediately for transformation. 100 ul of these cells
were mixed with 5 ul of mini prep DNA and incubated with light at 30°C for 4 hours. This
mixture was then plated onto nylon filters resting on BG-11 agar supplemented with TES pH 8
and allowed to grow for 12-18 hours. The filters were then transferred to BG-11 agar + TES
+ 5 ug/ml kanamycin and allowed to grow until colonies appeared within 7-10 days (Packer
and Glazer, 1988). Colonies were then picked into BG-11 liquid media containing 5 ug/ml
kanamycin and allowed to grow for 5 days. These cells were then transferred to BG-11 media
containing 10 ug/ml kanamycin and allowed to grow for 5 days and then transferred to BG-11
+ kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells were then harvested for PCR
analysis to determine the presence of a disrupted ORF and also for HPLC analysis to
determine if the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates, using primers to the ends of the slr1737
orf, showed complete segregation of the mutant genome, meaning no copies of the wild type
genome could be detected in these strains. This suggests that function of the native gene is
not essential for cell function. HPLC analysis showed that the strain carrying the knockout
for slr1737 produced no detectable levels of tocopherol.

4C. Phytyl Prenyltransferase Enzyme Assays 
[3H]Homogentisic acid in 0.1% H3PO4 (specific radioactivity 40 Ci/mmol). Phytyl
pyrophosphate was synthesized as described by Joo, et al. (1973) Can. J. Biochem. 51:1527.
2-methyl-6-phytylquinol and 2,3-dimethyl-5-phytylquinol were synthesized as described by
Soll, et al. (1980) Phytochemistry 19:215. Homogentisic acid, α, β, δ, and γ-tocopherol, and
tocol, were purchased commercially.

The wild-type strain of Synechocystis sp. PCC 6803 was grown in BG11 medium with
bubbling air at 30°C under 50 μE m^-2 s^-1 fluorescent light, and 70% relative humidity. The
growth medium of the slr1736 knock-out (potential PPT) strain of this organism was
supplemented with 25 μg mL^-1 kanamycin. Cells were collected from 0.25 to 1 liter culture by
centrifugation at 5000 g for 10 min and stored at -80°C.

Total membranes were isolated according to Zak's procedures with some modifications
(Zak, et al. (1999) Eur. J. Biochem. 261:311). Cells were broken on a French press. Before the
French press treatment, the cells were incubated for 1 hour with lysozyme (0.5%, w/v) at 30°C
in a medium containing 7 mM EDTA, 5 mM NaCl and 10 mM Hepes-NaOH, pH 7.4. The
spheroplasts were collected by centrifugation at 5000 g for 10 min and resuspended at 0.1 - 0.5
mg chlorophyll mL^-1 in 20 mM potassium phosphate buffer, pH 7.8. A proper amount of
protease inhibitor cocktail and DNAase I from Boehringer Mannheim were added to the
solution. French press treatments were performed two to three times at 100 MPa. After
breakage, the cell suspension was centrifuged for 10 min at 5000 g to pellet unbroken cells, and
this was followed by centrifugation at 100 000 g for 1 hour to collect total membranes. The
final pellet was resuspended in a buffer containing 50 mM Tris-HCl and 4 mM MgCl2.
Chloroplast pellets were isolated from 250 g of spinach leaves obtained from local
markets. Deveined leaf sections were cut into grinding buffer (2 L/250 g leaves) containing 2
mM EDTA, 1 mM MgCl2, 1 mM MnCl2, 0.33 M sorbitol, 0.1% ascorbic acid, and 50 mM
Hepes at pH 7.5. The leaves were homogenized for 3 sec three times in a 1-L blender, and
filtered through 4 layers of Miracloth. The supernatant was then centrifuged at 5000 g for 6
min. The chloroplast pellets were resuspended in a small amount of grinding buffer (Douce, et
al. Methods in Chloroplast Molecular Biology, 239 (1982)).

Chloroplasts in pellets can be broken in three ways. Chloroplast pellets were first
aliquoted in 1 mg of chlorophyll per tube, centrifuged at 6000 rpm for 2 min in a
microcentrifuge, and grinding buffer was removed. Two hundred microliters of Triton X-100
buffer (0.1% Triton X-100, 50 mM Tris-HCl pH 7.6 and 4 mM MgCl2) or swelling buffer (10
mM Tris pH 7.6 and 4 mM MgCl2) was added to each tube and incubated for 1/2 hour at 4°C.
Then the broken chloroplast pellets were used for the assay immediately. In addition, broken
chloroplasts can also be obtained by freezing in liquid nitrogen and storing at -80°C for 1/2 hour,
then used for the assay.

In some cases chloroplast pellets were further purified with a 40%/80% Percoll gradient to
obtain intact chloroplasts. The intact chloroplasts were broken with swelling buffer, then
either used for assay or further purified for envelope membranes with a 20.5%/31.8% sucrose
density gradient (Soll, et al. (1980) supra). The membrane fractions were centrifuged at 100
000 g for 40 min and resuspended in 50 mM Tris-HCl pH 7.6, 4 mM MgCl2.

Various amounts of [3H]HGA, 40 to 60 μM unlabelled HGA with specific activity in the
range of 0.16 to 4 Ci/mmole were mixed with a proper amount of 1 M Tris-NaOH pH 10 to
adjust pH to 7.6. HGA was reduced for 4 min with a trace amount of solid NaBH4. In
addition to HGA, the standard incubation mixture (final vol 1 mL) contained 50 mM Tris-HCl, pH
7.6, 3-5 mM MgCl2, and 100 μM phytyl pyrophosphate. The reaction was initiated by
addition of Synechocystis total membranes, spinach chloroplast pellets, spinach broken
chloroplasts, or spinach envelope membranes. The enzyme reaction was carried out for 2 hours
at 23°C or 30°C in the dark or light. The reaction is stopped by freezing with liquid nitrogen,
and stored at -80°C or directly by extraction.

A constant amount of tocol was added to each assay mixture and reaction products were
extracted with a 2 mL mixture of chloroform/methanol (1:2, v/v) to give a monophasic
solution. NaCl solution (2 mL; 0.9%) was added with vigorous shaking. This extraction
procedure was repeated three times. The organic layer containing the prenylquinones was
filtered through a 20 mμ filter, evaporated under N2, and then resuspended in 100 μL of
ethanol.

The samples were mainly analyzed by a Normal-Phase HPLC method (isocratic 90%
hexane and 10% methyl-t-butyl ether), using a Zorbax silica column, 4.6 x 250 mm. The
samples were also analyzed by a Reversed-Phase HPLC method (isocratic 0.1% H3PO4 in
MeOH), using a Vydac 201HS54 C18 column, 4.6 x 250 mm, coupled with an Alltech C18
guard column. The amounts of products were calculated based on the substrate specific
radioactivity, and adjusted according to the % recovery based on the amount of internal
standard.
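
The quantitation described above (counts converted to moles through the substrate specific radioactivity, then corrected for recovery of the tocol internal standard) can be sketched as below. This is a hedged illustration, not the authors' worksheet. It assumes counts are expressed as dpm and that recovery is the fraction of added tocol found in the final extract. The function and parameter names are hypothetical; the only constant used is the definition 1 Ci = 2.22 x 10^12 dpm.

```python
DPM_PER_CI = 2.22e12  # disintegrations per minute in one curie (by definition)


def product_pmol(product_dpm, specific_activity_ci_per_mmol, recovery_fraction):
    """Convert counted radioactivity to pmol of product, corrected for recovery.

    product_dpm: radioactivity in the product peak (dpm); assumed input unit.
    specific_activity_ci_per_mmol: substrate specific radioactivity (Ci/mmol).
    recovery_fraction: internal standard recovered / internal standard added.
    """
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in (0, 1]")
    dpm_per_mmol = specific_activity_ci_per_mmol * DPM_PER_CI
    mmol = product_dpm / dpm_per_mmol
    return mmol * 1e9 / recovery_fraction  # 1 mmol = 1e9 pmol


if __name__ == "__main__":
    # Illustrative numbers only: 0.16 Ci/mmol is the low end of the range in the text.
    print(product_pmol(product_dpm=355.2,
                       specific_activity_ci_per_mmol=0.16,
                       recovery_fraction=0.5))
```

Dividing by the recovery fraction scales the measured product up to what would have been found had none been lost during extraction.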

The amount of chlorophyll was determined as described in Arnon (1949) Plant Physiol.
24:1. The amount of protein was determined by the Bradford method using gamma globulin as a
standard (Bradford, (1976) Anal. Biochem. 72:248).

Results of the assay demonstrate that 2-Methyl-6-Phytylplastoquinone is produced in
the Synechocystis slr1736 knockout preparations. The results of the phytyl prenyltransferase
enzyme activity assay for the slr1736 knock out are presented in Figure 23.

4D. Complementation of the slr1736 knockout with ATPT2

In order to determine whether ATPT2 could complement the knockout of slr1736 in
Synechocystis 6803, a plasmid was constructed to express the ATPT2 sequence from the TAC
promoter. A vector, plasmid psl1211, was obtained from the lab of Dr. Himadri Pakrasi of
Washington University, and is based on the plasmid RSF1010 which is a broad host range
plasmid (Ng W.-O., Zentella R., Wang, Y., Taylor J-S. A., Pakrasi, H.B. 2000. phrA, the
major photoreactivating factor in the cyanobacterium Synechocystis sp. strain PCC 6803
codes for a cyclobutane pyrimidine dimer specific DNA photolyase. Arch. Microbiol. (in
press)). The ATPT2 gene was isolated from the vector pCGN10817 by PCR using the
following primers. ATPT2nco.pr 5'-CCATGGATTCGAGTAAAGTTGTCGC (SEQ ID
NO:89); ATPT2ri.pr 5'-GAATTCACTTCAAAAAAGGTAACAG (SEQ ID NO:90). These
primers will remove approximately 112 bp from the 5' end of the ATPT2 sequence, which is
thought to be the chloroplast transit peptide. These primers will also add an NcoI site at the
5' end and an EcoRI site at the 3' end which can be used for sub-cloning into subsequent
vectors. The PCR product from using these primers and pCGN10817 was ligated into pGEM
T easy and the resulting vector pMON21689 was confirmed by sequencing using the
M13 forward and M13 reverse primers. The NcoI/EcoRI fragment from pMON21689 was then
ligated with the EagI/EcoRI and EagI/NcoI fragments from psl1211 resulting in
pMON21690. The plasmid pMON21690 was introduced into the slr1736 Synechocystis 6803
KO strain via conjugation. Cells of sl906 (a helper strain) and DH10B cells containing
pMON21690 were grown to log phase (O.D. 600 = 0.4) and 1 ml was harvested by
centrifugation. The cell pellets were washed twice with a sterile BG-11 solution and
resuspended in 200 ul of BG-11. The following was mixed in a sterile eppendorf tube: 50 ul
sl906, 50 ul DH10B cells containing pMON21690, and 100 ul of a fresh culture of the
slr1736 Synechocystis 6803 KO strain (O.D. 730 = 0.2-0.4). The cell mixture was
immediately transferred to a nitrocellulose filter resting on BG-11 and incubated for 24 hours
at 30°C and 2500 lux (50 uE) of light. The filter was then transferred to BG-11 supplemented
with 10 ug/ml Gentamycin and incubated as above for ~5 days. When colonies appeared, they
were picked and grown up in liquid BG-11 + Gentamycin 10 ug/ml (Elhai, J. and Wolk, P.
1988. Conjugal transfer of DNA to Cyanobacteria. Methods in Enzymology 167, 747-54).
The liquid cultures were then assayed for tocopherols by harvesting 1 ml of culture by
centrifugation, extracting with ethanol/pyrogallol, and HPLC separation. The slr1736
Synechocystis 6803 KO strain did not contain any detectable tocopherols, while the slr1736
Synechocystis 6803 KO strain transformed with pMON21690 contained detectable alpha
tocopherol. A Synechocystis 6803 strain transformed with psl1211 (vector control) produced
alpha tocopherol as well.

Example 5: Transgenic Plant Analysis 

Arabidopsis plants transformed with constructs for the sense or antisense expression
of the ATPT proteins were analyzed by High Pressure Liquid Chromatography (HPLC) for
altered levels of total tocopherols, as well as altered levels of specific tocopherols (alpha,
beta, gamma, and delta tocopherol).

Extracts of leaves and seeds were prepared for HPLC as follows. For seed extracts, 
10 mg of seed was added to 1 g of microbeads (Biospec) in a sterile microfuge tube to which 
1 0 500 ul 1 % pyrogallol (Sigma Chem)/ethanol was added. The mixture was shaken for 3 

minutes in a mini Beadbeater (Biospec) on "fast'* speed. The extract was filtered through a 
0.2 um filter into an autosampler tube. The filtered extracts were then used in HPLC analysis 
described below. 

Leaf extracts were prepared by mixing 30-50 mg of leaf tissue with 1 g microbeads 
15 and freezing in liquid nitrogen until extraction. For extraction, 500 ul 1% pyrogallol in 
ethanol was added to the leaf/bead mixture and shaken for 1 minute on a Beadbeater 
(Biospec) on *'fast" speed. The resulting mixture was centrifiiged for 4 minutes at 14,000 rpm 
and filtered as described above prior to HPLC analysis. 

HPLC was performed on a Zorbax silica HPLC column (4.6 mm x 250 mm) with
fluorescence detection, an excitation at 290 nm, an emission at 336 nm, and bandpass and slits.
Solvent A was hexane and solvent B was methyl-t-butyl ether. The injection volume was 20
ul, the flow rate was 1.5 ml/min, and the run time was 12 min (40°C) using the gradient (Table 5):

Table 5:

Time       Solvent A    Solvent B
0 min.     90%          10%
10 min.    90%          10%
11 min.    25%          75%
12 min.    90%          10%
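The solvent composition at any point in the 12 minute run can be read off the Table 5 breakpoints. The following is a minimal sketch only: it assumes the pump ramps linearly between the listed time points, which the text does not state, and the function name solvent_composition is hypothetical.

```python
from bisect import bisect_right

# Gradient breakpoints taken from Table 5: (time in min, % solvent A, % solvent B).
GRADIENT = [
    (0.0, 90.0, 10.0),
    (10.0, 90.0, 10.0),
    (11.0, 25.0, 75.0),
    (12.0, 90.0, 10.0),
]


def solvent_composition(t_min):
    """Return (%A, %B) at time t_min.

    Assumption (not from the text): composition changes linearly
    between consecutive breakpoints.
    """
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-12 min run")
    times = [row[0] for row in GRADIENT]
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    t0, a0, b0 = GRADIENT[i - 1]
    t1, a1, b1 = GRADIENT[i]
    frac = (t_min - t0) / (t1 - t0)
    return a0 + frac * (a1 - a0), b0 + frac * (b1 - b0)


if __name__ == "__main__":
    for t in (0.0, 5.0, 10.5, 11.0, 12.0):
        print(t, solvent_composition(t))
```

Under this assumption the column runs isocratically at 90% hexane for the first 10 minutes, and the brief step to 75% methyl-t-butyl ether at 11 minutes acts as a wash before the return to starting conditions.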

Tocopherol standards in 1% pyrogallol/ethanol were also run for comparison (alpha
tocopherol, gamma tocopherol, beta tocopherol, delta tocopherol, and tocopherol (tocol), all
from Matreya).

Standard curves for alpha, beta, delta, and gamma tocopherol were calculated using
Chemstation software. The absolute amount of component x is:

Absolute amount of x = Response_x × RF_x × dilution factor

where Response_x is the area of peak x, RF_x is the response factor for component x
(Amount_x/Response_x) and the dilution factor is 500 ul. The ng/mg
tissue is found by: total ng component/mg plant tissue.
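The quantification described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Chemstation calculation itself: the function names are hypothetical, the response factor is assumed to carry units such that the product comes out in ng, and the numbers in the usage lines are invented for illustration rather than measured.

```python
def response_factor(standard_amount_ng, standard_peak_area):
    """RF_x = Amount_x / Response_x, taken from a tocopherol standard."""
    if standard_peak_area <= 0:
        raise ValueError("peak area must be positive")
    return standard_amount_ng / standard_peak_area


def absolute_amount_ng(peak_area, rf, dilution_factor=500.0):
    """Absolute amount of x = Response_x * RF_x * dilution factor.

    The default of 500 follows the 500 ul extraction volume in the text.
    """
    return peak_area * rf * dilution_factor


def ng_per_mg_tissue(total_ng, tissue_mg):
    """ng/mg tissue = total ng component / mg plant tissue."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    return total_ng / tissue_mg


if __name__ == "__main__":
    # Hypothetical values for illustration only (not data from this work).
    rf_alpha = response_factor(standard_amount_ng=10.0, standard_peak_area=2000.0)
    total = absolute_amount_ng(peak_area=400.0, rf=rf_alpha)
    print(ng_per_mg_tissue(total, tissue_mg=10.0))
```

Keeping the response factor, the absolute amount, and the per-tissue normalization as separate steps mirrors the three quantities named in the text, so each can be checked against the standard curve independently.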

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines
containing pMON10822 for the expression of ATAT2 from the napin promoter are provided
in Figure 24.

HPLC analysis results of Arabidopsis seed tissue expressing the ATAT2 sequence
from the napin promoter (pMON10822) demonstrate an increased level of tocopherols in the
seed. Total tocopherol levels are increased as much as 50 to 60% over the total tocopherol
levels of non-transformed (wild-type) Arabidopsis plants (Figure 24).

Furthermore, levels of particular tocopherols are also increased in transgenic
Arabidopsis plants expressing the ATAT2 nucleic acid sequence from the napin promoter.
Levels of delta tocopherol in these lines are increased greater than 3 fold over the delta
tocopherol levels obtained from the seeds of wild type Arabidopsis lines. Levels of gamma
tocopherol in transgenic Arabidopsis lines expressing the ATAT2 nucleic acid sequence are
increased as much as about 60% over the levels obtained in the seeds of non-transgenic
control lines. Furthermore, levels of alpha tocopherol are increased as much as 3 fold over
those obtained from non-transgenic control lines.
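The results above mix two ways of expressing a change, percent increase and fold change, relative to wild-type levels. The short sketch below shows how the two relate; the function names are hypothetical and the example values are invented for illustration, not taken from Figure 24.

```python
def fold_change(transgenic_level, wild_type_level):
    """Ratio of the transgenic level to the wild-type level."""
    if wild_type_level <= 0:
        raise ValueError("wild-type level must be positive")
    return transgenic_level / wild_type_level


def percent_increase(transgenic_level, wild_type_level):
    """Increase over wild type, as a percentage of the wild-type level."""
    return (fold_change(transgenic_level, wild_type_level) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical ng/mg values for illustration only.
    wild_type, transgenic = 400.0, 640.0
    print(fold_change(transgenic, wild_type))
    print(percent_increase(transgenic, wild_type))
```

On this reading, a 60% increase corresponds to a 1.6 fold level, while a level that is 3 fold that of wild type corresponds to a 200% increase.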

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines
containing pMON10803 for the expression of ATAT2 from the enhanced 35S promoter are
provided in Figure 25.

All publications and patent applications mentioned in this specification are indicative
of the level of skill of those skilled in the art to which this invention pertains. All
publications and patent applications are herein incorporated by reference to the same extent as
if each individual publication or patent application was specifically and individually indicated
to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within the scope of the appended claims.

Claims

What is Claimed is:

1. An isolated nucleic acid sequence encoding a prenyltransferase.

2. An isolated nucleic acid sequence according to Claim 1, wherein said prenyltransferase is
selected from the group consisting of straight chain prenyltransferase and aromatic prenyltransferase.

3. An isolated DNA sequence according to Claim 1, wherein said nucleic acid sequence is
isolated from a eukaryotic cell source.

4. An isolated DNA sequence according to Claim 3, wherein said eukaryotic cell source is
selected from the group consisting of mammalian, nematode, fungal, and plant cells.

5. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
Arabidopsis.

6. The DNA encoding sequence of Claim 5 wherein said prenyltransferase protein is encoded
by a sequence selected from the group consisting of the sequences of Figure 1.

7. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
corn.

8. The DNA encoding sequence of Claim 7 wherein said prenyltransferase protein is encoded
by a sequence which includes the EST of the sequences of Figure 3.

9. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
soybean.

10. The DNA encoding sequence of Claim 9 wherein said prenyltransferase protein is
encoded by a sequence which includes the ESTs of the group consisting of the sequences of Figure 2
and Figure 9.

11. An isolated DNA sequence according to Claim 1, wherein said nucleic acid sequence is
isolated from a prokaryotic cell source.

12. An isolated DNA sequence according to Claim 11, wherein said prokaryotic source is
Synechocystis.

13. A nucleic acid construct comprising as operably linked components, a transcriptional
initiation region functional in a host cell, a nucleic acid sequence encoding prenyltransferase, and a
transcriptional termination region.

14. A nucleic acid construct according to Claim 13, wherein said nucleic acid sequence
encoding prenyltransferase is obtained from an organism selected from the group consisting of a
eukaryotic organism and a prokaryotic organism.

15. A nucleic acid construct according to Claim 14, wherein said nucleic acid sequence
encoding prenyltransferase is obtained from a plant source.

16. A nucleic acid construct according to Claim 15, wherein said nucleic acid sequence
encoding prenyltransferase is obtained from a source selected from the group consisting of
Arabidopsis, soybean and corn.

17. A nucleic acid construct according to Claim 13, wherein said nucleic acid sequence
encoding prenyltransferase is obtained from Synechocystis.

18. A plant cell comprising the construct of Claim 13.

19. A method for the alteration of the tocopherol content in a host cell, comprising:
transforming said host cell with a construct comprising as operably linked components, a
transcriptional initiation region functional in a host cell, a nucleic acid sequence encoding
prenyltransferase, and a transcriptional termination region.

20. The method according to Claim 19, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

21. The method according to Claim 20, wherein said prokaryotic cell is Synechocystis.

22. The method according to Claim 20, wherein said eukaryotic cell is a plant cell.

23. The method according to Claim 22, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, and corn.

24. A method for producing a tocopherol compound of interest in a host cell, said method
comprising obtaining a transformed host cell, said host cell having and expressing in its genome:

a construct having a DNA sequence encoding a prenyltransferase operably linked to a
transcriptional initiation region functional in a host cell,

wherein said prenyltransferase is involved in the synthesis of tocopherols.

25. The method according to Claim 24, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

26. The method according to Claim 25, wherein said prokaryotic cell is Synechocystis.

27. The method according to Claim 24, wherein said eukaryotic cell is a plant cell.

28. The method according to Claim 27, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, and corn.

29. A method for increasing the biosynthetic flux in cell from a host cell toward
tocopherol production, said method comprising transforming said host cell with a construct
comprising as operably linked components, a transcriptional initiation region functional in a
host cell, a DNA encoding a prenyltransferase involved in the synthesis of tocopherols, and a
transcriptional termination region.

30. The method according to Claim 29, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

31. The method according to Claim 30, wherein said prokaryotic cell is Synechocystis.

32. The method according to Claim 30, wherein said eukaryotic cell is a plant cell.

33. The method according to Claim 32, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, and corn.






[Figure 2: plasmid restriction map]
[Figure 3: plasmid restriction map]
[Figure 4: plasmid restriction map]
[Figure 5: plasmid restriction map]
[Figure 6: plasmid restriction map]
[Figure 7: plasmid restriction map]
[Figure 8: plasmid restriction map]
[Figure 9: plasmid restriction map]



[Figure 10: restriction map of pCGN10811 (19830 bp); labeled features: Spc/Str, aadA, RB, polylinker, napin promoter, Turbo linker, ATPT3, napin 3', 35S promoter]
[Figure 11: plasmid restriction map]
[Figure 12: plasmid restriction map]
[Figure 13: plasmid restriction map]
[Figure 14: plasmid restriction map]
[Figure 15: plasmid restriction map]
[Figure 16: plasmid restriction map]
[Figure 17: plasmid restriction map]
[Figure 18: plasmid restriction map]
SEQUENCE LISTING

<110> Calgene LLC

<120> Nucleic Acid Sequences Involved in Tocopherol Synthesis

<130> 17133/00/WO

<150> 60/129,899
<151> 1999-04-15

<150> 60/146,461
<151> 1999-07-30

<160> 94

<170> FastSEQ for Windows Version 4.0

<210> 1 
<211> 1182 
<212> DNA 

<213> Arabidopsis sp 



<400> 1 

atggagtctc tgctctctag tncttctctt gtttccgctg ctggtgggtt ttgttggaag 60
aagcagaatc taaagctcca ctctttatca gaaatccgag ttctgcgttg tgattcgagt 120
aaagttgtcg caaaaccgaa gtttaggaac aatcttgtta ggcctgatgg tcaaggatct 180
tcattgttgt tgtatccaaa acataagtcg agatttcggg ttaatgccac tgcgggtcag 240
cctgaggctt tcgactcgaa ragcaaacag aagtctttta gagactcgtt agatgcgttt 300
tacaggtttt ctaggcctca tacagttatt ggcacagtgc ttagcatttt atctgtatct 360
ttcttagcag tagagaaggt ttctgatata tctcctttac ttttcactgg catcttggag 420
gctgttgttg cagctctcat gatgaacatt tacatagttg ggctaaatca gttgtctgat 480
gttgaaatag ataaggttaa caagccccat cttccattgg catcaggaga atattctgtt 540
aacaccggca ttgcaatagc agctnccttc tccatcatga gtttctggct tgggtggatt 600
gttggttcat ggccattgtr ctgggccctt tttgtgagtt tcatgctcgg tactgcatac 660
tctatcaatt tgccactttt acggcggaaa agatttgcat tggttgcagc aatgtgtatc 720
ctcgctgtcc gagctattac cgttcaaatc gccttttatc tacatattca gacacatgtg 780
ttcggaagac caatcttgtc cactaggcct cttattttcg ccactgcgtt tatgagcttt 840
ntctccgtcg ttattgcatt gtttaaggat atacctgata tcgaagggga taagatattc 900
ggaatccgat cattctctgt aactctgggt cagaaacggg tgttttggac atgtgttaca 960
ctacttcaaa tggcttacgc cgtcgcaatt ctagttggag ccacatctcc attcatatgg 1020
agcaaagtca tctcggttgt gggtcatgtc atactcgcaa caactttgtg ggctcgagct 1080
aagtccgttg atctgagtag caaaaccgaa ataacttcat gttatatgtc catatggaag 1140
ctcttttatg cagagtactt gctgttacct tctttgaagt ga 1182

<210> 2
<211> 393
<212> PRT
<213> Arabidopsis sp

<400> 2 

Met Glu Ser Leu Leu Ser Ser Ser Ser Leu Val Ser Ala Ala Gly Gly
1 5 10 15
Phe Cys Trp Lys Lys Gln Asn Leu Lys Leu His Ser Leu Ser Glu Ile
20 25 30
Arg Val Leu Arg Cys Asp Ser Ser Lys Val Val Ala Lys Pro Lys Phe
35 40 45
Arg Asn Asn Leu Val Arg Pro Asp Gly Gln Gly Ser Ser Leu Leu Leu
50 55 60
Tyr Pro Lys His Lys Ser Arg Phe Arg Val Asn Ala Thr Ala Gly Gln
65 70 75 80
Pro Glu Ala Phe Asp Ser Asn Ser Lys Gln Lys Ser Phe Arg Asp Ser
85 90 95
Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile Gly Thr
100 105 110
Val Leu Ser Ile Leu Ser Val Ser Phe Leu Ala Val Glu Lys Val Ser
115 120 125
Asp Ile Ser Pro Leu Leu Phe Thr Gly Ile Leu Glu Ala Val Val Ala
130 135 140
Ala Leu Met Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu Ser Asp
145 150 155 160
Val Glu Ile Asp Lys Val Asn Lys Pro Tyr Leu Pro Leu Ala Ser Gly
165 170 175
Glu Tyr Ser Val Asn Thr Gly Ile Ala Ile Val Ala Ser Phe Ser Ile
180 185 190
Met Ser Phe Trp Leu Gly Trp Ile Val Gly Ser Trp Pro Leu Phe Trp
195 200 205
Ala Leu Phe Val Ser Phe Met Leu Gly Thr Ala Tyr Ser Ile Asn Leu
210 215 220
Pro Leu Leu Arg Trp Lys Arg Phe Ala Leu Val Ala Ala Met Cys Ile
225 230 235 240
Leu Ala Val Arg Ala Ile Ile Val Gln Ile Ala Phe Tyr Leu His Ile
245 250 255
Gln Thr His Val Phe Gly Arg Pro Ile Leu Phe Thr Arg Pro Leu Ile
260 265 270
Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala Leu Phe
275 280 285
Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Ile Phe Gly Ile Arg Ser
290 295 300
Phe Ser Val Thr Leu Gly Gln Lys Arg Val Phe Trp Thr Cys Val Thr
305 310 315 320
Leu Leu Gln Met Ala Tyr Ala Val Ala Ile Leu Val Gly Ala Thr Ser
325 330 335
Pro Phe Ile Trp Ser Lys Val Ile Ser Val Val Gly His Val Ile Leu
340 345 350
Ala Thr Thr Leu Trp Ala Arg Ala Lys Ser Val Asp Leu Ser Ser Lys
355 360 365
Thr Glu Ile Thr Ser Cys Tyr Met Phe Ile Trp Lys Leu Phe Tyr Ala
370 375 380
Glu Tyr Leu Leu Leu Pro Phe Leu Lys
385 390

<210> 3
<211> 1224
<212> DNA

<213> Arabidopsis sp 

<400> 3 

atggcgtttt ttgggctctc ccgtgtttca agacggttgt tgaaatcttc cgtctccgta 60
actccatctt cttcctctgc tcttttgcaa tcacaacata aatccttgtc caatcctgtg 120
actacccatt acacaaatcc tttcactaag tgttatcctt catggaatga taattaccaa 180
gtatggagta aaggaagaga attgcatcag gagaagtttt ttggtgttgg ttggaattac 240
agattaattt gtggaatgtc gtcgtcttct tcggttttgg agggaaagcc gaagaaagat 300
gataaggaga agagtgatgg tgttgttgtt aagaaagctt cttggataga tttgtattta 360
ccagaagaag ttagaggtta tgctaagctt gctcgattgg ataaacccat tggaacttgg 420
ttgcttgcgt ggccttgtat gtggtcgatt gcgttggctg ctgatcctgg aagccttcca 480
agttttaaat atatggcttt atttggttgc ggagcattac ttcttagagg tgctggttgt 540
actataaatg atctgcttga tcaggacata gatacaaagg ttgatcgtac aaaactaaga 600
cctatcgcca gtggtctttt gacaccattt caagggattg gatttctcgg gctgcagttg 660
cttttaggct tagggattct tctccaactt aacaattaca gccgtgtttt aggggcttca 720
tctttgttac ttgtcttttc ctacccactt atgaagaggt ttacattttg gcctcaagcc 780
tttttaggtt tgaccataaa ctggggagca ttgttaggat ggactgcagt taaaggaagc 840
atagcaccat ctattgtact ccctctctat ctctccggag tctgctggac ccttgtttat 900
gatactattt atgcacatca ggacaaagaa gatgatgtaa aagttggtgt taagtcaaca 960
gcccttagat tcggtgataa tacaaagctt tggttaactg gatttggcac agcatccata 1020
ggttttcttg cactttctgg attcagtgca gatctcgggt ggcaatatta cgcatcactg 1080
gccgctgcat caggacagtt aggatggcaa atagggacag ctgacttatc atctggtgct 1140
gactgcagta gaaaatttgt gtcgaacaag tggtttggtg ctattatatt tagtggagtt 1200
gtacttggaa gaagttttca ataa 1224



<210> 4
<211> 407
<212> PRT
<213> Arabidopsis sp

<400> 4

Met Ala Phe Phe Gly Leu Ser Arg Val Ser Arg Arg Leu Leu Lys Ser
1 5 10 15
Ser Val Ser Val Thr Pro Ser Ser Ser Ser Ala Leu Leu Gln Ser Gln
20 25 30
His Lys Ser Leu Ser Asn Pro Val Thr Thr His Tyr Thr Asn Pro Phe
35 40 45
Thr Lys Cys Tyr Pro Ser Trp Asn Asp Asn Tyr Gln Val Trp Ser Lys
50 55 60
Gly Arg Glu Leu His Gln Glu Lys Phe Phe Gly Val Gly Trp Asn Tyr
65 70 75 80
Arg Leu Ile Cys Gly Met Ser Ser Ser Ser Ser Val Leu Glu Gly Lys
85 90 95
Pro Lys Lys Asp Asp Lys Glu Lys Ser Asp Gly Val Val Val Lys Lys
100 105 110
Ala Ser Trp Ile Asp Leu Tyr Leu Pro Glu Glu Val Arg Gly Tyr Ala
115 120 125
Lys Leu Ala Arg Leu Asp Lys Pro Ile Gly Thr Trp Leu Leu Ala Trp
130 135 140
Pro Cys Met Trp Ser Ile Ala Leu Ala Ala Asp Pro Gly Ser Leu Pro
145 150 155 160
Ser Phe Lys Tyr Met Ala Leu Phe Gly Cys Gly Ala Leu Leu Leu Arg
165 170 175
Gly Ala Gly Cys Thr Ile Asn Asp Leu Leu Asp Gln Asp Ile Asp Thr
180 185 190
Lys Val Asp Arg Thr Lys Leu Arg Pro Ile Ala Ser Gly Leu Leu Thr
195 200 205
Pro Phe Gln Gly Ile Gly Phe Leu Gly Leu Gln Leu Leu Leu Gly Leu
210 215 220
Gly Ile Leu Leu Gln Leu Asn Asn Tyr Ser Arg Val Leu Gly Ala Ser
225 230 235 240
Ser Leu Leu Leu Val Phe Ser Tyr Pro Leu Met Lys Arg Phe Thr Phe
245 250 255
Trp Pro Gln Ala Phe Leu Gly Leu Thr Ile Asn Trp Gly Ala Leu Leu
260 265 270
Gly Trp Thr Ala Val Lys Gly Ser Ile Ala Pro Ser Ile Val Leu Pro
275 280 285
Leu Tyr Leu Ser Gly Val Cys Trp Thr Leu Val Tyr Asp Thr Ile Tyr
290 295 300
Ala His Gln Asp Lys Glu Asp Asp Val Lys Val Gly Val Lys Ser Thr
305 310 315 320
Ala Leu Arg Phe Gly Asp Asn Thr Lys Leu Trp Leu Thr Gly Phe Gly
325 330 335
Thr Ala Ser Ile Gly Phe Leu Ala Leu Ser Gly Phe Ser Ala Asp Leu
340 345 350
Gly Trp Gln Tyr Tyr Ala Ser Leu Ala Ala Ala Ser Gly Gln Leu Gly
355 360 365
Trp Gln Ile Gly Thr Ala Asp Leu Ser Ser Gly Ala Asp Cys Ser Arg
370 375 380
Lys Phe Val Ser Asn Lys Trp Phe Gly Ala Ile Ile Phe Ser Gly Val
385 390 395 400
Val Leu Gly Arg Ser Phe Gln
405

<210> 5
<211> 1296
<212> DNA

<213> Arabidopsis sp 

<400> 5 

atgtggcgaa gatctgttgt ttctcgttta tcttcaagaa tctctgtttc ttcttcgtta 



cctccggtct cgacggaatc aactgctaag ttagggatca ctggtgttag atctgatgcc 

aatcgagttt ttgccactgc tactgccgcc gctacagcta cagctaccac cggtgagatt 

tcgtctagag ttgcggcttt ggctggatta gggcatcact acgctcgttg ttattgggag 

ctttctaaag ctaaacttag tatgcttgtg gttgcaactt ctggaactgg gtatattctg 



60 



15 ccaaacccta gactgattcc ttggtcccgc gaattatgtg ccgttaatag cttctcccag 120 



180 
240 
300 
360 



2 0 ggtacgggaa atgctgcaat tagcttcccg gggctttgtt acacatgtgc aggaaccatg 42 0 



480 



atgattgctg catctgctaa ttccttgaat cagatttttg agataagcaa tgattctaag 
atgaaaagaa cgatgctaag gccattgcct tcaggacgta ttagtgttcc acacgctgtt 540 

600 
660 
720 



900 
960 
1020 



gcatgggcta ctattgctgg tgcttctggt gcttgtttgt tggccagcaa gactaatatg 

ttggctgctg gacttgcatc tgccaatctt gtactttatg cgtttgttta tactccgttg 

2 5 aagcaacttc accctatcaa tacatgggtt ggcgctgttg ttggtgctat cccacccttg 

cttgggtggg cggcagcgtc tggtcagatt tcatacaatt cgatgattct tccagctgct 780 

ctttactttt ggcagatacc tcattttatg gcccttgcac atctctgccg caatgattat . 840 

gcagctggag gttacaagat gttgtcactc tttgatccgt cagggaagag aatagcagca 

gtggctctaa ggaactgctt ttacatgatc cctctcggtt tcatcgccta tgactggggg 

3 0 ttaacctcaa gttggttttg cctcgaatca acacttctca cactagcaat cgctgcaaca 

gcattttcat tctaccgaga ccggaccatg cataaagcaa ggaaaatgtt ccatgccagt 1080 

cttctcttcc ttcctgtttt catgtctggt cttcttctac accgtgtctc taatgataat 1140 

cagcaacaac tcgtagaaga agccggatta acaaattctg tatctggtga agtcaaaact 1200 

cagaggcgaa agaaacgtgt ggctcaacct ccggtggctt atgcctctgc tgcaccgttt 1260 

3 5 cctttcctcc cagctccttc cttctactct ccatga 1296 

<210> 6
<211> 431
<212> PRT
<213> Arabidopsis sp

<400> 6

Met Trp Arg Arg Ser Val Val Tyr Arg Phe Ser Ser Arg Ile Ser Val
1 5 10 15

Ser Ser Ser Leu Pro Asn Pro Arg Leu Ile Pro Trp Ser Arg Glu Leu
20 25 30

Cys Ala Val Asn Ser Phe Ser Gln Pro Pro Val Ser Thr Glu Ser Thr
35 40 45

Ala Lys Leu Gly Ile Thr Gly Val Arg Ser Asp Ala Asn Arg Val Phe
50 55 60

Ala Thr Ala Thr Ala Ala Ala Thr Ala Thr Ala Thr Thr Gly Glu Ile
65 70 75 80

Ser Ser Arg Val Ala Ala Leu Ala Gly Leu Gly His His Tyr Ala Arg
85 90 95

Cys Tyr Trp Glu Leu Ser Lys Ala Lys Leu Ser Met Leu Val Val Ala
100 105 110

Thr Ser Gly Thr Gly Tyr Ile Leu Gly Thr Gly Asn Ala Ala Ile Ser
115 120 125

Phe Pro Gly Leu Cys Tyr Thr Cys Ala Gly Thr Met Met Ile Ala Ala
130 135 140

Ser Ala Asn Ser Leu Asn Gln Ile Phe Glu Ile Ser Asn Asp Ser Lys
145 150 155 160

Met Lys Arg Thr Met Leu Arg Pro Leu Pro Ser Gly Arg Ile Ser Val
165 170 175

Pro His Ala Val Ala Trp Ala Thr Ile Ala Gly Ala Ser Gly Ala Cys
180 185 190

Leu Leu Ala Ser Lys Thr Asn Met Leu Ala Ala Gly Leu Ala Ser Ala
195 200 205

Asn Leu Val Leu Tyr Ala Phe Val Tyr Thr Pro Leu Lys Gln Leu His
210 215 220

Pro Ile Asn Thr Trp Val Gly Ala Val Val Gly Ala Ile Pro Pro Leu
225 230 235 240

Leu Gly Trp Ala Ala Ala Ser Gly Gln Ile Ser Tyr Asn Ser Met Ile
245 250 255

Leu Pro Ala Ala Leu Tyr Phe Trp Gln Ile Pro His Phe Met Ala Leu
260 265 270

Ala His Leu Cys Arg Asn Asp Tyr Ala Ala Gly Gly Tyr Lys Met Leu
275 280 285

Ser Leu Phe Asp Pro Ser Gly Lys Arg Ile Ala Ala Val Ala Leu Arg
290 295 300

Asn Cys Phe Tyr Met Ile Pro Leu Gly Phe Ile Ala Tyr Asp Trp Gly
305 310 315 320

Leu Thr Ser Ser Trp Phe Cys Leu Glu Ser Thr Leu Leu Thr Leu Ala
325 330 335

Ile Ala Ala Thr Ala Phe Ser Phe Tyr Arg Asp Arg Thr Met His Lys
340 345 350

Ala Arg Lys Met Phe His Ala Ser Leu Leu Phe Leu Pro Val Phe Met
355 360 365

Ser Gly Leu Leu Leu His Arg Val Ser Asn Asp Asn Gln Gln Gln Leu
370 375 380

Val Glu Glu Ala Gly Leu Thr Asn Ser Val Ser Gly Glu Val Lys Thr
385 390 395 400

Gln Arg Arg Lys Lys Arg Val Ala Gln Pro Pro Val Ala Tyr Ala Ser
405 410 415

Ala Ala Pro Phe Pro Phe Leu Pro Ala Pro Ser Phe Tyr Ser Pro
420 425 430

<210> 7
<211> 479
<212> DNA
<213> Arabidopsis sp

<400> 7

ggaaactccc ggagcacctg tttgcaggta ccgctaacct taatcgataa tttatttctc 60

ttgtcaggaa ttatgtaagt ctggtggaag gctcgcatac catttttgca ttgcctttcg 120

ctatgatcgg gtttactttg ggtgtgatga gaccaggcgt ggctttatgg tatggcgaaa 180

acccattttt atccaatgct gcattccctc ccgatgattc gttctttcat tcctatacag 240

gtatcatgct gataaaactg ttactggtac tggtttgtat ggtatcagca agaagcgcgg 300

cgacggcgtt taaccggtat ctcgacaggc attttgacgc gaagaacccg cgtactgcca 360

tccgtgaaat acctgcgggc gtcatatctg ccaacagtgc gctggtgttt acgataggct 420

gctgcgtggt attctgggtg gcctgttatt tcattaacac gatctgtttt tacctggcg 479

<210> 8
<211> 551
<212> DNA
<213> Arabidopsis sp

<220>
<221> misc_feature
<222> (1)...(551)
<223> n = A,T,C or G

<400> 8

ttgtggctta caccttaatg agcatacgcc agnccattac ggctcgttaa tcggcgccat 60

ngccggngct gntgcaccgg tagtgggcta ctgcgccgtg accaatcagc ttgatctagc 120

ggctcttatt ctgtttttaa ttttactgtt ctggcaaatg ccgcattttt acgcgatttc 180

cattttcagg ctaaaagact tttcagcggc ctgtattccg gtgctgccca tcattaaaga 240

cctgcgctat accaaaatca gcatgctggt ttacgtgggc ttatttacac tggctgctat 300

catgccggcc ctcttagggt atgccggttg gatttatggg atagcggcct taattttagg 360

cttgtattgg ctttatattg ccatacaagg attcaagacc gccgatgatc aaaaatggtc 420

tcgtaagatg tttggatctt cgattttaat cactaccctc ttgtcggtaa tgatgcttgt 480

ttaaacttac tgcctcctga agtttatata tcgataattt cagcctaagg aggcttagtg 540

gttaattcaa t 551



<210> 9
<211> 297
<212> PRT
<213> Arabidopsis sp

<400> 9

Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1 5 10 15

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
20 25 30

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
35 40 45

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
50 55 60

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65 70 75 80

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
85 90 95

Val Val Met Gly Asn Lys Val Val Ala Leu Leu Ala Thr Ala Val Glu
100 105 110

His Leu Val Thr Gly Glu Thr Met Glu Ile Thr Ser Ser Thr Glu Gln
115 120 125

Arg Tyr Ser Met Asp Tyr Tyr Met Gln Lys Thr Tyr Tyr Lys Thr Ala
130 135 140

Ser Leu Ile Ser Asn Ser Cys Lys Ala Val Ala Val Leu Thr Gly Gln
145 150 155 160

Thr Ala Glu Val Ala Val Leu Ala Phe Glu Tyr Gly Arg Asn Leu Gly
165 170 175

Leu Ala Phe Gln Leu Ile Asp Asp Ile Leu Asp Phe Thr Gly Thr Ser
180 185 190

Ala Ser Leu Gly Lys Gly Ser Leu Ser Asp Ile Arg His Gly Val Ile
195 200 205

Thr Ala Pro Ile Leu Phe Ala Met Glu Glu Phe Pro Gln Leu Arg Glu
210 215 220

Val Val Asp Gln Val Glu Lys Asp Pro Arg Asn Val Asp Ile Ala Leu
225 230 235 240

Glu Tyr Leu Gly Lys Ser Lys Gly Ile Gln Arg Ala Arg Glu Leu Ala
245 250 255

Met Glu His Ala Asn Leu Ala Ala Ala Ala Ile Gly Ser Leu Pro Glu
260 265 270

Thr Asp Asn Glu Asp Val Lys Arg Ser Arg Arg Ala Leu Ile Asp Leu
275 280 285

Thr His Arg Val Ile Thr Arg Asn Lys
290 295

<210> 10
<211> 561
<212> DNA
<213> Arabidopsis sp

<400> 10

aagcgcatcc gtcctcttct acgattgccg ccagccgcat gtatggctgc ataaccgacc 60

gcccctatcc gctcgcggcc gcggtcgaat tcattcacac cgcgacgctg ctgcatgacg 120

acgtcgtcga tgaaagcgat ttgcgccgcg gccgcgaaag cgcgcataag gttttcggca 180

atcaggcgag cgtgctcgtc ggcgatttcc ttttctcccg cgccttccag ctgatggtgg 240

aagacggctc gctcgacgcg ctgcgcattc tctcggatgc ctccgccgtg atcgcgcagg 300

gcgaagtgat gcagctcggc accgcgcgca atcttgaaac caatatgagc cagtatctcg 360

atgtgatcag cgcgaagacc gccgcgctct ttgccgccgc ctgcgaaatc ggcccggtga 420

tggcgaacgc gaaggcggaa gatgctgccg cgatgtgcga atacggcatg aatctcggta 480

tcgccttcca gatcatcgac gaccttctcg attacggcac cggcggccac gccgagcttg 540

gcaagaacac gggcgacgat t 561



<210> 11
<211> 966
<212> DNA
<213> Arabidopsis sp

<400> 11

atggtacttg ccgaggttcc aaagcttgcc tctgctgctg agtacttctt caaaaggggt 60

gtgcaaggaa aacagtttcg ttcaactatt ttgctgctga tggcgacagc tctgaatgta 120

cgcgttccag aagcattgat tggggaatca acagatatag tcacatcaga attacgcgta 180

aggcaacggg gtattgctga aatcactgaa atgatacacg tcgcaagtct actgcacgat 240

gatgtcttgg atgatgccga tacaaggcgt ggtgttggtt ccttaaatgt tgtaatgggt 300

aacaagatgt cggtattagc aggagacttc ttgctctccc gggcttgtgg ggctctcgct 360

gctttaaaga acacagaggt tgtagcatta cttgcaactg ctgtagaaca tcttgttacc 420

ggtgaaacca tggaaataac tagttcaacc gagcagcgtt atagtatgga ctactacatg 480

cagaagacat attataagac agcatcgcta atctctaaca gctgcaaagc tgttgccgtt 540

ctcactggac aaacagcaga agttgccgtg ttagcttttg agtatgggag gaatctgggt 600

ttagcattcc aattaataga cgacattctt gatttcacgg gcacatctgc ctctctcgga 660

aagggatcgt tgtcagatac tcgccatgga gtcataacag ccccaatcct ctttgccatg 720

gaagagtttc ctcaactacg cgaagttgtt gatcaagttg aaaaagatcc taggaatgtt 780

gacattgctt tagagtatct tgggaagagc aagggaatac agagggcaag agaattagcc 840

atggaacatg cgaatctagc agcagctgca atcgggtctc tacctgaaac agacaatgaa 900

gatgtcaaaa gatcgaggcg ggcacttatt gacttgaccc atagagtcat caccagaaac 960

aagtga 966



<210> 12
<211> 321
<212> PRT
<213> Arabidopsis sp

<400> 12

Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1 5 10 15

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
20 25 30

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
35 40 45

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
50 55 60

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65 70 75 80

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
85 90 95

Val Val Met Gly Asn Lys Met Ser Val Leu Ala Gly Asp Phe Leu Leu
100 105 110

Ser Arg Ala Cys Gly Ala Leu Ala Ala Leu Lys Asn Thr Glu Val Val
115 120 125

Ala Leu Leu Ala Thr Ala Val Glu His Leu Val Thr Gly Glu Thr Met
130 135 140

Glu Ile Thr Ser Ser Thr Glu Gln Arg Tyr Ser Met Asp Tyr Tyr Met
145 150 155 160

Gln Lys Thr Tyr Tyr Lys Thr Ala Ser Leu Ile Ser Asn Ser Cys Lys
165 170 175

Ala Val Ala Val Leu Thr Gly Gln Thr Ala Glu Val Ala Val Leu Ala
180 185 190

Phe Glu Tyr Gly Arg Asn Leu Gly Leu Ala Phe Gln Leu Ile Asp Asp
195 200 205

Ile Leu Asp Phe Thr Gly Thr Ser Ala Ser Leu Gly Lys Gly Ser Leu
210 215 220

Ser Asp Ile Arg His Gly Val Ile Thr Ala Pro Ile Leu Phe Ala Met
225 230 235 240

Glu Glu Phe Pro Gln Leu Arg Glu Val Val Asp Gln Val Glu Lys Asp
245 250 255

Pro Arg Asn Val Asp Ile Ala Leu Glu Tyr Leu Gly Lys Ser Lys Gly
260 265 270

Ile Gln Arg Ala Arg Glu Leu Ala Met Glu His Ala Asn Leu Ala Ala
275 280 285

Ala Ala Ile Gly Ser Leu Pro Glu Thr Asp Asn Glu Asp Val Lys Arg
290 295 300

Ser Arg Arg Ala Leu Ile Asp Leu Thr His Arg Val Ile Thr Arg Asn
305 310 315 320

Lys



<210> 13
<211> 621
<212> DNA
<213> Arabidopsis sp

<400> 13

gctttctcct ttgctaattc ttgagctttc ttgatcccac cgcgatttct aactatttca 60

atcgcttctt caagcgatcc aggctcacaa aactcagact caatgatctc tcttagcctt 120

ggctcattct ctagcgcgaa gatcactggc gccgttatgt tacctttggc taagtcatta 180

gctgcaggct tacctaactg ctctgtggac tgagtgaagt ccagaatgtc atcaactact 240

tgaaaagata aaccgagatt cttcccgaac tgatacattt gctctgcgac cttgctttcg 300

actttactga aaattgctgc tcctttggtg cttgcagcta ctaatgaagc tgtcttgtag 360

taactcttta gcatgcagtc atcaagcttg acatcacaat cgaataaact cgatgcttgc 420

tttatctcac cgcttgcaaa atctttgatc acctgcaaaa agataaatca agattcagac 480

caaatgttct ttgtattgag tagcttcatc taatctcaga aaggaatatt acctgactta 540

tgagcttaat gacttcaagg ttttcgagat ttgtaagtac catgatgctt gagcaacatg 600

aaatccccag ctaatacagc t 621



<210> 14
<211> 741
<212> DNA
<213> Arabidopsis sp

<400> 14

ggtgagtttt gttaatagtt atgagattca tctatttttg tcataaaatt gtttggtttg 60

gtttaaactc tgtgtataat tgcaggaaag gaaacagttc atgagctttt cggcacaaga 120

gtagcggtgc tagctggaga tttcatgttt gctcaagcgt catggtactt agcaaatctc 180

gagaatcttg aagttattaa gctcatcagt caggtactta gttactctta cattgttttt 240

ctatgaggtt gagctatgaa tctcatttcg ttgaataatg ctgtgcctca aacttttttt 300

catgttttca ggtgatcaaa gactttgcaa gcggagagat aaagcaggcg tccagcttat 360

ttgactgcga caccaagctc gacgagtact tactcaaaag tttctacaag acagcctctt 420

tagtggctgc gagcaccaaa ggagctgcca ttttcagcag agttgagcct gatgtgacag 480

aacaaatgta cgagtttggg aagaatctcg gtctctcttt ccagatagtt gatgatattt 540

tggatttcac tcagtcgaca gagcagctcg ggaagccagc agggagtgat ttggctaaag 600

gtaacttaac agcacctgtg attttcgctc tggagaggga gccaaggcta agagagatca 660

ttgagtcaaa gttctgtgag gcgggttctc tggaagaagc gattgaagcg gtgacaaaag 720

gtggggggat taagagagca c 741



<210> 15
<211> 1087
<212> DNA
<213> Arabidopsis sp

<400> 15

cctcttcagc caatccagag gaagaagaga caacttttta tctttcgtca agagtctccg 60

aaaacgcacg gttttatgct ctctcttctg ccctcacctc acaagacgca gggcacatga 120

ttcaaccaga gggaaaaagc aacgataaca actctgcttt tgatttcaag ctgtatatga 180

tccgcaaagc cgagtctgta aatgcggctc tcgacgtttc cgtaccgctt ctgaaacccc 240

ttacgatcca agaagcggtc aggtactctt tgctagccgg cggaaaacgt gtgaggcctc 300

tgctctgcat tgccgcttgt gagcttgtgg ggggcgacga ggctactgcc atgtcagccg 360

cttgcgcggt cgagatgatc cacacaagct ctctcattca tgacgatctt ccgtgcatgg 420

acaatgccga cctccgtaga ggcaagccca ccaatcacaa ggtatgttgt ttaattatat 480

gaaggctcag agataatgct gaactagtgt tgaaccaatt tttgctcaaa caaggtatat 540

ggagaagaca tggcggtttt ggcaggtgat gcactccttg cattggcgtt tgagcacatg 600

acggttgtgt cgagtgggtt ggtcgctccc gagaagatga ttcgcgccgt ggttgagctg 660

gccagggcca tagggactac agggctagtt gctggacaaa tgatagacct agccagcgaa 720

agactgaatc cagacaaggt tggattggag catctagagt tcatccatct ccacaaaacg 780

gcggcattgt tggaggcagc ggcagtttta ggggttataa tgggaggtgg aacagaggaa 840

gaaatcgaaa agcttagaaa gtatgctagg tgtattggac tactgtttca ggttgttgat 900

gacattctcg acgtaacaaa atctactgag gaattgggta agacagccgg aaaagacgta 960

atggccggaa agctgacgta tccaaggctg ataggtttgg agggatccag ggaagttgca 1020

gagcacctga ggagagaagc agaggaaaag cttaaagggt ttgatccaag tcaggcggcg 1080

cctctgg 1087




<210> 16
<211> 1164
<212> DNA
<213> Arabidopsis sp

<400> 16

atgacttcga ttctcaacac tgtctccacc atccactctt ccagagttac ctccgtcgat 60

cgagtcggag tcctctctct tcggaattcg gattccgttg agttcactcg ccggcgttct 120

ggtttctcga cgttgatcta cgaatcaccc gggcggagat ttgttgtgcg tgcggcggag 180

actgatactg ataaagttaa atctcagaca cctgacaagg caccagccgg tggttcaagc 240

attaaccagc ttctcggtat caaaggagca tctcaagaaa ctaataaatg gaagattcgt 300

cttcagctta caaaaccagt cacttggcct ccactggttt ggggagtcgt ctgtggtgct 360

gctgcttcag ggaactttca ttggacccca gaggatgttg ctaagtcgat tctttgcatg 420

atgatgtctg gtccttgtct tactggctat acacagacaa tcaacgactg gtatgataga 480

gatatcgacg caattaatga gccatatcgt ccaattccat ctggagcaat atcagagcca 540

gaggttatta cacaagtctg ggtgctatta ttgggaggtc ttggtattgc tggaatatta 600

gatgtgtggg cagggcatac cactcccact gtcttctatc ttgctttggg aggatcattg 660

ctatcttata tatactctgc tccacctctt aagctaaaac aaaatggatg ggttggaaat 720

tttgcacttg gagcaagcta tattagtttg ccatggtggg ctggccaagc attgtttggc 780

actcttacgc cagatgttgt tgttctaaca ctcttgtaca gcatagctgg gttaggaata 840

gccattgtta acgacttcaa aagtgttgaa ggagatagag cattaggact tcagtctctc 900

ccagtagctt ttggcaccga aactgcaaaa tggatatgcg ttggtgctat agacattact 960

cagctttctg ttgccggata tctattagca tctgggaaac cttattatgc gttggcgttg 1020

gttgctttga tcattcctca gattgtgttc cagtttaaat actttctcaa ggaccctgtc 1080

aaatacgacg tcaagtacca ggcaagcgcg cagccattct tggtgctcgg aatatttgta 1140

acggcattag catcgcaaca ctga 1164




<210> 17
<211> 387
<212> PRT
<213> Arabidopsis sp

<400> 17

Met Thr Ser Ile Leu Asn Thr Val Ser Thr Ile His Ser Ser Arg Val
1 5 10 15

Thr Ser Val Asp Arg Val Gly Val Leu Ser Leu Arg Asn Ser Asp Ser
20 25 30

Val Glu Phe Thr Arg Arg Arg Ser Gly Phe Ser Thr Leu Ile Tyr Glu
35 40 45

Ser Pro Gly Arg Arg Phe Val Val Arg Ala Ala Glu Thr Asp Thr Asp
50 55 60

Lys Val Lys Ser Gln Thr Pro Asp Lys Ala Pro Ala Gly Gly Ser Ser
65 70 75 80

Ile Asn Gln Leu Leu Gly Ile Lys Gly Ala Ser Gln Glu Thr Asn Lys
85 90 95

Trp Lys Ile Arg Leu Gln Leu Thr Lys Pro Val Thr Trp Pro Pro Leu
100 105 110

Val Trp Gly Val Val Cys Gly Ala Ala Ala Ser Gly Asn Phe His Trp
115 120 125

Thr Pro Glu Asp Val Ala Lys Ser Ile Leu Cys Met Met Met Ser Gly
130 135 140

Pro Cys Leu Thr Gly Tyr Thr Gln Thr Ile Asn Asp Trp Tyr Asp Arg
145 150 155 160

Asp Ile Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala
165 170 175

Ile Ser Glu Pro Glu Val Ile Thr Gln Val Trp Val Leu Leu Leu Gly
180 185 190

Gly Leu Gly Ile Ala Gly Ile Leu Asp Val Trp Ala Gly His Thr Thr
195 200 205

Pro Thr Val Phe Tyr Leu Ala Leu Gly Gly Ser Leu Leu Ser Tyr Ile
210 215 220

Tyr Ser Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Val Gly Asn
225 230 235 240

Phe Ala Leu Gly Ala Ser Tyr Ile Ser Leu Pro Trp Trp Ala Gly Gln
245 250 255

Ala Leu Phe Gly Thr Leu Thr Pro Asp Val Val Val Leu Thr Leu Leu
260 265 270

Tyr Ser Ile Ala Gly Leu Gly Ile Ala Ile Val Asn Asp Phe Lys Ser
275 280 285

Val Glu Gly Asp Arg Ala Leu Gly Leu Gln Ser Leu Pro Val Ala Phe
290 295 300

Gly Thr Glu Thr Ala Lys Trp Ile Cys Val Gly Ala Ile Asp Ile Thr
305 310 315 320

Gln Leu Ser Val Ala Gly Tyr Leu Leu Ala Ser Gly Lys Pro Tyr Tyr
325 330 335

Ala Leu Ala Leu Val Ala Leu Ile Ile Pro Gln Ile Val Phe Gln Phe
340 345 350

Lys Tyr Phe Leu Lys Asp Pro Val Lys Tyr Asp Val Lys Tyr Gln Ala
355 360 365

Ser Ala Gln Pro Phe Leu Val Leu Gly Ile Phe Val Thr Ala Leu Ala
370 375 380

Ser Gln His
385

<210> 18
<211> 981
<212> DNA
<213> Arabidopsis sp

<400> 18

atgttgttta gtggttcagc gatcccatta agcagcttct gctctcttcc ggagaaaccc 60

cacactcttc ctatgaaact ctctcccgct gcaatccgat cttcatcctc atctgccccg 120

SSrgtcgttga acttcgatct gaggacgtat tggacgactc tgatcaccga gatcaaccag 180

aagctggatg aggccatacc ggtcaagcac cctgcgggga tctacgaggc tatgagatac 240

tctgtactcg cacaacrcrccfc caagcgtgcc cctcctgtga t cr t CT t a t or CTC crcT f f t~ cr fT r5 CT 300

ctcttcggtg gcgatcgcct cgccgctttc cccaccgcct gtgccctaga aatggtgcac 360

gcggcttcgt tgatacacga cgacctcccc tgtatggacg acgatcctgt gcgcagagga 420

aagccatcta accacactgt ctacggctct ggcatggcca ttctcgccgg tgacgccctc 480

ttcccactcg ccttccagca cattgtctcc cacacgcctc ctgaccttgt tccccgagcc 540

accatcctca gactcatcac tgagattgcc cgcactgtcg gctccactgg tat acre tcrca 600

ggccagtacg tcgaccttga aggaggtccc tttcctcttt cctttgttca ggagaagaaa 660

ttcggagcca tgggtgaatg ctctgccgtg tgcggtggcc tattgggcgg tgccactgag 720

gatgagctcc agagtctccg aaggtacggg agagccgtcg ggatgctgta tcaggtggtc 780

gatgacatca ccgaggacaa gaagaagagc tatgatggtg gagcagagaa gggaatgatg 840

gaaatggcgg aagagctcaa ggagaaggcg aagaaggagc ttcaagtgtt tgacaacaag 900

tatggaggag gagacacact tgttcctctc tacaccttcg ttgactacgc tgctcatcga 960

cattttcttc ttcccctctg 981




<210> 19
<211> 245
<212> DNA
<213> Glycine sp

<400> 19

gcaacatctg ggactgggtt tgtcttgggg sgtggtagtg ctgttgatct ttcggcactt 60

tcttgcactt gcttgggtac catgatggtt gctgcatctg ctaactcttt gaatcaggtg 120

tttgagatca ataatgatgc taaaatgaag agaacaagtc gcaggccact accctcagga 180

cgcatcacaa tacctcatgc agttggctgg gcatcctctg ttggattagc tggtacggct 240

ctact 245


<210> 20
<211> 253
<212> DNA
<213> Glycine sp

<400> 20

attggctttc caagatcatt gggttttctt gttgcattca tgaccctcta ctccttgggt 60






ttggcattgt ccaaggatat acctgacgtt gaaggagata aagagcacgg cattgattct 120 

tttgcagtac gtctaggtca gaaacgggca ttttggattt gcgtttcctt ttttgaaatg 180 

gctttcggag ttggtatcct ggccggagca tcatgctcac acttttggac taaaattttc 240 

acgggtatgg gaa 253

<210> 21
<211> 275
<212> DNA
<213> Glycine sp

<400> 21

tgatcttcta ctctctgggt atggcattgt ccaaggatat atctgacgtt aaaggagata 60

aagcatacgg catcgatact ttagcgatac gtttgggtca aaaatgggta ttttggattt 120

gcattatcct ttttgaaatg gcttttggag ttgccctctt ggcaggagca acatcttctt 180

acctttggat taaaattgtc acgggtctgg gacatgctat tcttgcttca attctcttgt 240

accaagccaa atctatatac ttgagcaaca aagtt 275

<210> 22
<211> 299
<212> DNA
<213> Glycine sp

<220>
<221> misc_feature
<222> (1)...(299)
<223> n = A,T,C or G

<400> 22

ccanaatang tncatcttng aaagacaatt ggcctcttca acacacaagt ctgcatgtga 60

agaagaggcc aattgtcttt ccaagatcac ttatngtggc tattgtaatc atgaacttct 120

tctttgtggg tatggcattg gcaaaggata tacctanctg ttgaaggaga taaaatatat 180

ggcattgata cttttgcaat acgtataggt caaaaacaag tattttggat ttgtattttc 240

ctttttgaaa ggctttcgga gtttccctag tggcaggagc aacatcttct agccttggt 299

<210> 23
<211> 767
<212> DNA
<213> Glycine sp

<400> 23

gtggaggctg tggttgctgc cctgtttatg aatatttaca ttgttggttt gaatcaattg 60

tctgatgtcg aaatagacaa gataaacaag ccgtatcttc cattagcatc tggggaatat 120

tcctttgaaa ctggtgtcac tattgttgca tctttttcaa ttctgagttt ttggcttggc 180

tgggttgtag gttcatggcc attattttgg gccctttttg taagctttgt gctaggaact 240

gcttattcaa tcaatgtgcc tctgttgaga tggaagaggt ttgcagtgct tgcagcgacg 300

tgcattctag ctgttcgggc agtaatagct caacttgcat ttttccttca catgcagact 360

catgtgtaca agaggccacc tgtcttttca agaccattga tttttgctac tgcattcatg 420

agcttcttct ctgtagttat agcactgttt aaggatatac ctgacattga aggagataaa 480

gtatttggca tccaatcttt ttcagtgtgt ttaggtcaga agccggtgtt ctggacttgt 540

gttacccttc ttgaaatagc ttatggagtc gccctcctgg tgggagctgc atctccttgt 600

ctttggagca aaattttcac gggtctggga cacgctgtgc tggcttcaat tctctggttt 660

catgccaaat ctgtagattt gaaaagcaaa gcttcgataa catccttcta tatgtttatt 720

tggaagctat tttatgcaga atacttactc attccttttg ttagatg 767

<210> 24
<211> 255
<212> PRT
<213> Glycine sp

<400> 24

Val Glu Ala Val Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly
1 5 10 15

Leu Asn Gln Leu Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr
20 25 30

Leu Pro Leu Ala Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile
35 40 45

Val Ala Ser Phe Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly
50 55 60

Ser Trp Pro Leu Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr
65 70 75 80

Ala Tyr Ser Ile Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val
85 90 95

Leu Ala Ala Met Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu
100 105 110

Ala Phe Phe Leu His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val
115 120 125

Phe Ser Arg Pro Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser
130 135 140

Val Val Ile Ala Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys
145 150 155 160

Val Phe Gly Ile Gln Ser Phe Ser Val Cys Leu Gly Gln Lys Pro Val
165 170 175

Phe Trp Thr Cys Val Thr Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu
180 185 190

Leu Val Gly Ala Ala Ser Pro Cys Leu Trp Ser Lys Ile Phe Thr Gly
195 200 205

Leu Gly His Ala Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser
210 215 220

Val Asp Leu Lys Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile
225 230 235 240

Trp Lys Leu Phe Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
245 250 255

<210> 25
<211> 360
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1)...(360)
<223> n = A,T,C or G

<400> 25

ggcgtcttca cttgttctgg tcttctcgta tcccctgatg aagaggttca cattttggcc 60

tcaggcttat cttggcctga cattcaactg gggagcttta ctagggtggg ctgctattaa 120

ggaaagcata gaccctgcaa atcatccttc cattgtatac agctggtatt tgttggacgc 180

tggtgtatga tactatatat gcgcatcagg tgtttcgcta tccctacttt catattaatc 240

cttgatgaag tggccatttc atgttgtcgc ggtggtctta tacttgcata tctccatgca 300

tctcaggaca aagangatga cctgaaagta ggagtccaag tccacagctt aagatttggg 360

<210> 26
<211> 299
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1)...(299)
<223> n = A,T,C or G

<400> 26

gatggttgca gcatctgcaa ataccctcaa ccaggtgttt gngataaaaa atgatgctaa 60

aatgaaaagg acaatgcgtg ccccctgcca tctggtcgca ttagtcctgc acatgctgcg 120

atgtgggcta caa^t^tt-gg agttgcagga acagctttgt tggcctggaa ggctaatggc 180

ttggcagctg ggcttgcagc ttctaatctt gttctgtatg catttgtgta tacgccgttg 240

aagcaaatac accctgttaa tacatgggtt ggggcagtcg ttggtgccat cccaccact 299

<210> 27
<211> 255
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1)...(255)
<223> n = A,T,C or G

<400> 27

anacttgcat atctccatgc ntctcaggac aaagangatg acctgaaagt aggtgtcaag 60

tccacagcat taagatttgg agatttgacc nnatactgna tcagtggctt tggcgcggca 120

tgcttcggca gcttagcact cagtggttac aatgctgacc ttggttggtg tttagtgtga 180

tgcttgagcg aagaatggta tngtttttac ttgatattga ctccagacct gaaatcatgt 240

tggacagggt ggccc 255

<210> 28
<211> 257
<212> DNA
<213> Zea sp

<400> 28

attgaagggg ataggactct ggggcttcag tcacttcctg ttgcttttgg gatggaaact 60

gcaaaatgga tttgtgttgg agcaattgat atcactcaat tatctgttgc aggttaccta 120

ttgagcaccg gtaagctgta ttatgccctg gtgttgcttg ggctaacaat tcctcaggtg 180

ttctttcagt tccagtactt cctgaaggac cctgtgaagt atgatgtcaa atatcaggca 240

agcgcacaac cattctt 257



<210> 29
<211> 368
<212> DNA
<213> Zea sp

<400> 29

atccagttgc aaataataat ggcgttcttc tctgttgtaa tagcactatt caaggatata 60

cctgacatcg aaggggaccg catattcggg atccgatcct tcagcgtccg gttagggcaa 120

aagaaggtct tttggatctg cgttggcttg cttgagatgg cctacagcgt tgcgatactg 180

atgggagcta cctcttcctg tttgtggagc aaaacagcaa ccatcgctgg ccattccata 240

cttgccgcga tcctatggag ctgcgcgcga tcggtggact tgacgagcaa agccgcaata 300

acgtccttct acatgttcat ctggaagctg ttctacgcgg agtacctgct catccctctg 360

gtgcggtg 368



<210> 30
<211> 122
<212> PRT
<213> Zea sp

<400> 30

Ile Gln Leu Gln Ile Ile Met Ala Phe Phe Ser Val Val Ile Ala Leu
1 5 10 15

Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Arg Ile Phe Gly Ile Arg
20 25 30

Ser Phe Ser Val Arg Leu Gly Gln Lys Lys Val Phe Trp Ile Cys Val
35 40 45

Gly Leu Leu Glu Met Ala Tyr Ser Val Ala Ile Leu Met Gly Ala Thr
50 55 60

Ser Ser Cys Leu Trp Ser Lys Thr Ala Thr Ile Ala Gly His Ser Ile
65 70 75 80

Leu Ala Ala Ile Leu Trp Ser Cys Ala Arg Ser Val Asp Leu Thr Ser
85 90 95

Lys Ala Ala Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe Tyr
100 105 110

Ala Glu Tyr Leu Leu Ile Pro Leu Val Arg
115 120

<210> 31
<211> 278
<212> DNA
<213> Zea sp

<400> 31

tattcagcac cacctctcaa gctcaagcag aatggatgga ttgggaactt cgctctgggt 60

gcgagttaca tcagcttgcc ctggtgggct ggccaggcgt tatttggaac tcttacacca 120

gatatcattg tcttgactac tttgtacagc atagctgggc tagggattgc tattgtaaat 180

gatttcaaga gtattgaagg ggataggact ctggggcttc agtcacttcc tgttgctttt 240

gggatggaaa ctgcaaaatg gatttgtgtt ggagcaat 278



<210> 32
<211> 292
<212> PRT
<213> Synechocystis sp

<400> 32

Met Val Ala Gln Thr Pro Ser Ser Pro Pro Leu Trp Leu Thr Ile Ile
1 5 10 15

Tyr Leu Leu Arg Trp His Lys Pro Ala Gly Arg Leu Ile Leu Met Ile
20 25 30

Pro Ala Leu Trp Ala Val Cys Leu Ala Ala Gln Gly Leu Pro Pro Leu
35 40 45

Pro Leu Leu Gly Thr Ile Ala Leu Gly Thr Leu Ala Thr Ser Gly Leu
50 55 60

Gly Cys Val Val Asn Asp Leu Trp Asp Arg Asp Ile Asp Pro Gln Val
65 70 75 80

Glu Arg Thr Lys Gln Arg Pro Leu Ala Ala Arg Ala Leu Ser Val Gln
85 90 95

Val Gly Ile Gly Val Ala Leu Val Ala Leu Leu Cys Ala Ala Gly Leu
100 105 110

Ala Phe Tyr Leu Thr Pro Leu Ser Phe Trp Leu Cys Val Ala Ala Val
115 120 125

Pro Val Ile Val Ala Tyr Pro Gly Ala Lys Arg Val Phe Pro Val Pro
130 135 140






Gln Leu Val Leu Ser Ile Ala Trp Gly Phe Ala Val Leu Ile Ser Trp
145 150 155 160

Ser Ala Val Thr Gly Asp Leu Thr Asp Ala Thr Trp Val Leu Trp Gly
165 170 175

Ala Thr Val Phe Trp Thr Leu Gly Phe Asp Thr Val Tyr Ala Met Ala
180 185 190

Asp Arg Glu Asp Asp Arg Arg Ile Gly Val Asn Ser Ser Ala Leu Phe
195 200 205

Phe Gly Gln Tyr Val Gly Glu Ala Val Gly Ile Phe Phe Ala Leu Thr
210 215 220

Ile Gly Cys Leu Phe Tyr Leu Gly Met Ile Leu Met Leu Asn Pro Leu
225 230 235 240

Tyr Trp Leu Ser Leu Ala Ile Ala Ile Val Gly Trp Val Ile Gln Tyr
245 250 255

Ile Gln Leu Ser Ala Pro Thr Pro Glu Pro Lys Leu Tyr Gly Gln Ile
260 265 270

Phe Gly Gln Asn Val Ile Ile Gly Phe Val Leu Leu Ala Gly Met Leu
275 280 285

Leu Gly Trp Leu
290
285 



<210> 33
<211> 316
<212> PRT
<213> Synechocystis sp

<400> 33

Met Val Thr Ser Thr Lys Ile His Arg Gln His Asp Ser Met Gly Ala
1 5 10 15

Val Cys Lys Ser Tyr Tyr Gln Leu Thr Lys Pro Arg Ile Ile Pro Leu
20 25 30

Leu Leu Ile Thr Thr Ala Ala Ser Met Trp Ile Ala Ser Glu Gly Arg
35 40 45

Val Asp Leu Pro Lys Leu Leu Ile Thr Leu Leu Gly Gly Thr Leu Ala
50 55 60

Ala Ala Ser Ala Gln Thr Leu Asn Cys Ile Tyr Asp Gln Asp Ile Asp
65 70 75 80

Tyr Glu Met Leu Arg Thr Arg Ala Arg Pro Ile Pro Ala Gly Lys Val
85 90 95

Gln Pro Arg His Ala Leu Ile Phe Ala Leu Ala Leu Gly Val Leu Ser
100 105 110

Phe Ala Leu Leu Ala Thr Phe Val Asn Val Leu Ser Gly Cys Leu Ala
115 120 125

Leu Ser Gly Ile Val Phe Tyr Met Leu Val Tyr Thr His Trp Leu Lys
130 135 140

Arg His Thr Ala Gln Asn Ile Val Ile Gly Gly Ala Ala Gly Ser Ile
145 150 155 160

Pro Pro Leu Val Gly Trp Ala Ala Val Thr Gly Asp Leu Ser Trp Thr
165 170 175

Pro Trp Val Leu Phe Ala Leu Ile Phe Leu Trp Thr Pro Pro His Phe
180 185 190

Trp Ala Leu Ala Leu Met Ile Lys Asp Asp Tyr Ala Gln Val Asn Val
195 200 205

Pro Met Leu Pro Val Ile Ala Gly Glu Glu Lys Thr Val Ser Gln Ile
210 215 220

Trp Tyr Tyr Ser Leu Leu Val Val Pro Phe Ser Leu Leu Leu Val Tyr
225 230 235 240

Pro Leu His Gln Leu Gly Ile Leu Tyr Leu Ala Ile Ala Ile Ile Leu
245 250 255

Gly Gly Gln Phe Leu Val Lys Ala Trp Gln Leu Lys Gln Ala Pro Gly
260 265 270

Asp Arg Asp Leu Ala Arg Gly Leu Phe Lys Phe Ser Ile Phe Tyr Leu
275 280 285

Met Leu Leu Cys Leu Ala Met Val Ile Asp Ser Leu Pro Val Thr His
290 295 300

Gln Leu Val Ala Gln Met Gly Thr Leu Leu Leu Gly
305 310 315

<210> 34
<211> 324
<212> PRT
<213> Synechocystis sp

<400> 34

Met Ser Asp Thr Gln Asn Thr Gly Gln Asn Gln Ala Lys Ala Arg Gln
1 5 10 15

Leu Leu Gly Met Lys Gly Ala Ala Pro Gly Glu Ser Ser Ile Trp Lys
20 25 30

Ile Arg Leu Gln Leu Met Lys Pro Ile Thr Trp Ile Pro Leu Ile Trp
35 40 45

Gly Val Val Cys Gly Ala Ala Ser Ser Gly Gly Tyr Ile Trp Ser Val
50 55 60

Glu Asp Phe Leu Lys Ala Leu Thr Cys Met Leu Leu Ser Gly Pro Leu
65 70 75 80

Met Thr Gly Tyr Thr Gln Thr Leu Asn Asp Phe Tyr Asp Arg Asp Ile
85 90 95

Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala Ile Ser
100 105 110

Val Pro Gln Val Val Thr Gln Ile Leu Ile Leu Leu Val Ala Gly Ile
115 120 125

Gly Val Ala Tyr Gly Leu Asp Val Trp Ala Gln His Asp Phe Pro Ile
130 135 140

Met Met Val Leu Thr Leu Gly Gly Ala Phe Val Ala Tyr Ile Tyr Ser
145 150 155 160

Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Leu Gly Asn Tyr Ala
165 170 175

Leu Gly Ala Ser Tyr Ile Ala Leu Pro Trp Trp Ala Gly His Ala Leu
180 185 190

Phe Gly Thr Leu Asn Pro Thr Ile Met Val Leu Thr Leu Ile Tyr Ser
195 200 205

Leu Ala Gly Leu Gly Ile Ala Val Val Asn Asp Phe Lys Ser Val Glu
210 215 220

Gly Asp Arg Gln Leu Gly Leu Lys Ser Leu Pro Val Met Phe Gly Ile
225 230 235 240

Gly Thr Ala Ala Trp Ile Cys Val Ile Met Ile Asp Val Phe Gln Ala
245 250 255

Gly Ile Ala Gly Tyr Leu Ile Tyr Val His Gln Gln Leu Tyr Ala Thr
260 265 270

Ile Val Leu Leu Leu Leu Ile Pro Gln Ile Thr Phe Gln Asp Met Tyr
275 280 285

Phe Leu Arg Asn Pro Leu Glu Asn Asp Val Lys Tyr Gln Ala Ser Ala
290 295 300

Gln Pro Phe Leu Val Phe Gly Met Leu Ala Thr Gly Leu Ala Leu Gly
305 310 315 320

His Ala Gly Ile



<210> 35
<211> 307
<212> PRT
<213> Synechocystis sp

<400> 35
Met Thr Glu Ser Ser Pro Leu Ala Pro Ser Thr Ala Pro Ala Thr Arg
1 5 10 15

Lys Leu Trp Leu Ala Ala Ile Lys Pro Pro Met Tyr Thr Val Ala Val
20 25 30

Val Pro Ile Thr Val Gly Ser Ala Val Ala Tyr Gly Leu Thr Gly Gln
35 40 45

Trp His Gly Asp Val Phe Thr Ile Phe Leu Leu Ser Ala Ile Ala Ile
50 55 60

Ile Ala Trp Ile Asn Leu Ser Asn Asp Val Phe Asp Ser Asp Thr Gly
65 70 75 80

Ile Asp Val Arg Lys Ala His Ser Val Val Asn Leu Thr Gly Asn Arg
85 90 95

Asn Leu Val Phe Leu Ile Ser Asn Phe Phe Leu Leu Ala Gly Val Leu
100 105 110

Gly Leu Met Ser Met Ser Trp Arg Ala Gln Asp Trp Thr Val Leu Glu
115 120 125

Leu Ile Gly Val Ala Ile Phe Leu Gly Tyr Thr Tyr Gln Gly Pro Pro
130 135 140

Phe Arg Leu Gly Tyr Leu Gly Leu Gly Glu Leu Ile Cys Leu Ile Thr
145 150 155 160

Phe Gly Pro Leu Ala Ile Ala Ala Ala Tyr Tyr Ser Gln Ser Gln Ser
165 170 175

Phe Ser Trp Asn Leu Leu Thr Pro Ser Val Phe Val Gly Ile Ser Thr
180 185 190

Ala Ile Ile Leu Phe Cys Ser His Phe His Gln Val Glu Asp Asp Leu
195 200 205

Ala Ala Gly Lys Lys Ser Pro Ile Val Arg Leu Gly Thr Lys Leu Gly
210 215 220

Ser Gln Val Leu Thr Leu Ser Val Val Ser Leu Tyr Leu Ile Thr Ala
225 230 235 240

Ile Gly Val Leu Cys His Gln Ala Pro Trp Gln Thr Leu Leu Ile Ile
245 250 255

Ala Ser Leu Pro Trp Ala Val Gln Leu Ile Arg His Val Gly Gln Tyr
260 265 270

His Asp Gln Pro Glu Gln Val Ser Asn Cys Lys Phe Ile Ala Val Asn
275 280 285

Leu His Phe Phe Ser Gly Met Leu Met Ala Ala Gly Tyr Gly Trp Ala
290 295 300

Gly Leu Gly
305

<210> 36
<211> 927
<212> DNA
<213> Synechocystis sp

<400> 36
atggcaacta tccaagcttt ttggcgcttc tcccgccccc ataccatcat tggtacaact 60
ctgagcgtct gggctgtgta tctgttaact attctcgggg atggaaactc agttaactcc 120
cctgcttccc tggatttagt gttcggcgct tggctggcct gcctgttggg taatgtgtac 180
attgtcggcc tcaaccaatt gtgggatgtg gacattgacc gcatcaataa gccgaatttg 240
cccctagcta acggagattt ttctatcgcc cagggccgtt ggattgtggg actttgtggc 300
gttgcttcct tggcgatcgc ctggggatta gggctatggc tggggctaac ggtgggcatt 360
agtttgatta ttggcacggc ctattcggtg ccgccagtga ggttaaagcg cttttccctg 420
ctggcggccc tgtgtattct gacggtgcgg ggaattgtgg ttaacttggg cttattttta 480
ttttttagaa ttggtttagg ttatcccccc actttaataa cccccatctg ggttttgact 540
ttatttatct tagttttcac cgtggcgatc gccattttta aagatgtgcc agatatggaa 600
ggcgatcggc aatttaagat tcaaacttta actttgcaaa tcggcaaaca aaacgttttt 660
cggggaacct taattttacc cactggttgt tatttagcca tggcaatctg gggcttatgg 720
gcggctatgc ctttaaatac tgctttcttg attgtttccc atttgtgctt attagcctta 780
ctctggtggc ggagtcgaga tgtacactta gaaagcaaaa ccgaaattgc tagtttttat 840
cagtttattt ggaagctatt tttcttagag tacttgctgt atcccttggc tctgtggtta 900
cctaattttt ctaatactat tttttag 927

<210> 37
<211> 308
<212> PRT
<213> Synechocystis sp

<400> 37
Met Ala Thr Ile Gln Ala Phe Trp Arg Phe Ser Arg Pro His Thr Ile
1 5 10 15

Ile Gly Thr Thr Leu Ser Val Trp Ala Val Tyr Leu Leu Thr Ile Leu
20 25 30

Gly Asp Gly Asn Ser Val Asn Ser Pro Ala Ser Leu Asp Leu Val Phe
35 40 45

Gly Ala Trp Leu Ala Cys Leu Leu Gly Asn Val Tyr Ile Val Gly Leu
50 55 60

Asn Gln Leu Trp Asp Val Asp Ile Asp Arg Ile Asn Lys Pro Asn Leu
65 70 75 80

Pro Leu Ala Asn Gly Asp Phe Ser Ile Ala Gln Gly Arg Trp Ile Val
85 90 95

Gly Leu Cys Gly Val Ala Ser Leu Ala Ile Ala Trp Gly Leu Gly Leu
100 105 110

Trp Leu Gly Leu Thr Val Gly Ile Ser Leu Ile Ile Gly Thr Ala Tyr
115 120 125

Ser Val Pro Pro Val Arg Leu Lys Arg Phe Ser Leu Leu Ala Ala Leu
130 135 140

Cys Ile Leu Thr Val Arg Gly Ile Val Val Asn Leu Gly Leu Phe Leu
145 150 155 160

Phe Phe Arg Ile Gly Leu Gly Tyr Pro Pro Thr Leu Ile Thr Pro Ile
165 170 175

Trp Val Leu Thr Leu Phe Ile Leu Val Phe Thr Val Ala Ile Ala Ile
180 185 190

Phe Lys Asp Val Pro Asp Met Glu Gly Asp Arg Gln Phe Lys Ile Gln
195 200 205

Thr Leu Thr Leu Gln Ile Gly Lys Gln Asn Val Phe Arg Gly Thr Leu
210 215 220

Ile Leu Leu Thr Gly Cys Tyr Leu Ala Met Ala Ile Trp Gly Leu Trp
225 230 235 240

Ala Ala Met Pro Leu Asn Thr Ala Phe Leu Ile Val Ser His Leu Cys
245 250 255

Leu Leu Ala Leu Leu Trp Trp Arg Ser Arg Asp Val His Leu Glu Ser
260 265 270

Lys Thr Glu Ile Ala Ser Phe Tyr Gln Phe Ile Trp Lys Leu Phe Phe
275 280 285

Leu Glu Tyr Leu Leu Tyr Pro Leu Ala Leu Trp Leu Pro Asn Phe Ser
290 295 300

Asn Thr Ile Phe
305

<210> 38
<211> 1092
<212> DNA
<213> Synechocystis sp

<400> 38
atgaaatttc cgccccacag tggttaccat tggcaaggtc aatcaccttt ctttgaaggt 60
tggtacgtgc gcctgctttt gccccaatcc ggggaaagtt ttgcttttat gtactccatc 120
gaaaatcctg ctagcgatca tcattacggc ggcggtgctg tgcaaatttt agggccggct 180
acgaaaaaac aagaaaatca ggaagaccaa cttgtttggc ggacatttcc ctcggtaaaa 240
aaattttggg ccagtcctcg ccagtttgcc ctagggcatt ggggaaaatg tagggataac 300
aggcaggcga aacccctact ctccgaagaa ttttttgcca cggtcaagga aggttatcaa 360
atccatcaaa atcagcacca aggacaaatc attcatggcg atcgccattg tcgttggcag 420
ttcaccgtag aaccggaagt aacttggggg agtcctaacc gatttcctcg ggctacagcg 480
ggttggcttt cctttttacc cttgtttgat cccggttggc aaattctttt agcccaaggt 540
agagcgcacg gctggctgaa atggcagagg gaacagtatg aatttgacca cgccctagtt 600
tatgccgaaa aaaattgggg tcactccttt ccctcccgct ggttttggct ccaagcaaat 660
tattttcctg accatccagg actgagcgtc actgccgctg gcggggaacg gattgttctt 720
ggtcgccccg aagaggtagc tttaattggc ttacatcacc aaggtaattt ttacgaattt 780
ggcccgggcc atggcacagt cacttggcaa gtagctccct ggggccgttg gcaattaaaa 840
gccagcaatg ataggtattg ggtcaagttg tccggaaaaa cagataaaaa aggcagttta 900
gtccacactc ccaccgccca gggcttacaa ctcaactgcc gagataccac taggggctat 960
ttgtatttgc aattgggatc tgtgggtcac ggcctgatag tgcaagggga aacggacacc 1020
gcggggctag aagttggagg tgattggggt ttaacagagg aaaatttgag caaaaaaaca 1080
gtgccattct ga 1092

<210> 39
<211> 363
<212> PRT
<213> Synechocystis sp

<400> 39
Met Lys Phe Pro Pro His Ser Gly Tyr His Trp Gln Gly Gln Ser Pro
1 5 10 15

Phe Phe Glu Gly Trp Tyr Val Arg Leu Leu Leu Pro Gln Ser Gly Glu
20 25 30

Ser Phe Ala Phe Met Tyr Ser Ile Glu Asn Pro Ala Ser Asp His His
35 40 45

Tyr Gly Gly Gly Ala Val Gln Ile Leu Gly Pro Ala Thr Lys Lys Gln
50 55 60

Glu Asn Gln Glu Asp Gln Leu Val Trp Arg Thr Phe Pro Ser Val Lys
65 70 75 80

Lys Phe Trp Ala Ser Pro Arg Gln Phe Ala Leu Gly His Trp Gly Lys
85 90 95

Cys Arg Asp Asn Arg Gln Ala Lys Pro Leu Leu Ser Glu Glu Phe Phe
100 105 110

Ala Thr Val Lys Glu Gly Tyr Gln Ile His Gln Asn Gln His Gln Gly
115 120 125

Gln Ile Ile His Gly Asp Arg His Cys Arg Trp Gln Phe Thr Val Glu
130 135 140

Pro Glu Val Thr Trp Gly Ser Pro Asn Arg Phe Pro Arg Ala Thr Ala
145 150 155 160

Gly Trp Leu Ser Phe Leu Pro Leu Phe Asp Pro Gly Trp Gln Ile Leu
165 170 175

Leu Ala Gln Gly Arg Ala His Gly Trp Leu Lys Trp Gln Arg Glu Gln
180 185 190

Tyr Glu Phe Asp His Ala Leu Val Tyr Ala Glu Lys Asn Trp Gly His
195 200 205

Ser Phe Pro Ser Arg Trp Phe Trp Leu Gln Ala Asn Tyr Phe Pro Asp
210 215 220

His Pro Gly Leu Ser Val Thr Ala Ala Gly Gly Glu Arg Ile Val Leu
225 230 235 240

Gly Arg Pro Glu Glu Val Ala Leu Ile Gly Leu His His Gln Gly Asn
245 250 255

Phe Tyr Glu Phe Gly Pro Gly His Gly Thr Val Thr Trp Gln Val Ala
260 265 270

Pro Trp Gly Arg Trp Gln Leu Lys Ala Ser Asn Asp Arg Tyr Trp Val
275 280 285

Lys Leu Ser Gly Lys Thr Asp Lys Lys Gly Ser Leu Val His Thr Pro
290 295 300

Thr Ala Gln Gly Leu Gln Leu Asn Cys Arg Asp Thr Thr Arg Gly Tyr
305 310 315 320

Leu Tyr Leu Gln Leu Gly Ser Val Gly His Gly Leu Ile Val Gln Gly
325 330 335

Glu Thr Asp Thr Ala Gly Leu Glu Val Gly Gly Asp Trp Gly Leu Thr
340 345 350

Glu Glu Asn Leu Ser Lys Lys Thr Val Pro Phe
355 360

<210> 40
<211> 56
<212> DNA
<213> Artificial Sequence

<400> 40
cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat tcaaat 56

<210> 41
<211> 32
<212> DNA
<213> Artificial Sequence

<400> 41
tcgaggatcc gcggccgcaa gcttcctgca gg 32

<210> 42
<211> 32
<212> DNA
<213> Artificial Sequence

<400> 42
tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 43
<211> 32
<212> DNA
<213> Artificial Sequence

<400> 43
tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 44
<211> 32
<212> DNA
<213> Artificial Sequence

<400> 44
tcgaggatcc gcggccgcaa gcttcctgca gg 32

<210> 45
<211> 36
<212> DNA
<213> Artificial Sequence

<400> 45
tcgaggatcc gcggccgcaa gcttcctgca ggagct 36

<210> 46
<211> 28
<212> DNA
<213> Artificial Sequence

<400> 46
cctgcaggaa gcttgcggcc gcggatcc 28

<210> 47
<211> 36
<212> DNA
<213> Artificial Sequence

<400> 47
tcgacctgca ggaagcttgc ggccgcggat ccagct 36

<210> 48
<211> 28
<212> DNA
<213> Artificial Sequence

<400> 48
ggatccgcgg ccgcaagctt cctgcagg 28

<210> 49
<211> 39
<212> DNA
<213> Artificial Sequence

<400> 49
gatcacctgc aggaagcttg cggccgcgga tccaatgca 39

<210> 50
<211> 31
<212> DNA
<213> Artificial Sequence

<400> 50
ttggatccgc ggccgcaagc ttcctgcagg t 31

<210> 51
<211> 41
<212> DNA
<213> Artificial Sequence

<400> 51
ggatccgcgg ccgcacaatg gagtctctgc tctctagttc t 41

<210> 52
<211> 38
<212> DNA
<213> Artificial Sequence

<400> 52
ggatcctgca ggtcacttca aaaaaggtaa cagcaagt 38

<210> 53
<211> 45
<212> DNA
<213> Artificial Sequence

<400> 53
ggatccgcgg ccgcacaatg gcgttttttg ggctctcccg tgttt 45

<210> 54
<211> 40
<212> DNA
<213> Artificial Sequence

<400> 54
ggatcctgca ggttattgaa aacttcttcc aagtacaact 40

<210> 55
<211> 38
<212> DNA
<213> Artificial Sequence

<400> 55
ggatccgcgg ccgcacaatg tggcgaagat ctgttgtt 38

<210> 56
<211> 37
<212> DNA
<213> Artificial Sequence

<400> 56
ggatcctgca ggtcatggag agtagaagga aggagct 37

<210> 57
<211> 50
<212> DNA
<213> Artificial Sequence

<400> 57
ggatccgcgg ccgcacaatg gtacttgccg aggttccaaa gcttgcctcc 50



<210> 58
<211> 38
<212> DNA
<213> Artificial Sequence

<400> 58
ggatcctgca ggccacttgt ttctggtgat gactctat 38

<210> 59
<211> 38
<212> DNA
<213> Artificial Sequence

<400> 59
ggatccgcgg ccgcacaatg acttcgattc tcaacact 38

<210> 60
<211> 36
<212> DNA
<213> Artificial Sequence

<400> 60
ggatcctgca ggtcagtgtt gcgatgctaa tgccgt 36

<210> 61
<211> 22
<212> DNA
<213> Artificial Sequence

<400> 61
taatgtgtac attgtcggcc tc 22

<210> 62
<211> 60
<212> DNA
<213> Artificial Sequence

<400> 62
gcaatgtaac atcagagatt ttgagacaca acgtggcttt ccacaattcc ccgcaccgtc 60

<210> 63
<211> 22
<212> DNA
<213> Artificial Sequence

<400> 63
aggctaataa gcacaaatgg ga 22

<210> 64
<211> 63
<212> DNA
<213> Artificial Sequence

<400> 64
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggaattgg tttaggttat 60
ccc 63

<210> 65
<211> 26
<212> DNA
<213> Artificial Sequence

<400> 65
ggatccatgg ttgcccaaac cccatc 26

<210> 66
<211> 61
<212> DNA
<213> Artificial Sequence

<400> 66
gcaatgtaac atcagagatt ttgagacaca acgtggcttt gggtaagcaa caatgaccgg 60
c 61

<210> 67
<211> 25
<212> DNA
<213> Artificial Sequence

<400> 67
gaattctcaa agccagccca gtaac 25

<210> 68
<211> 63
<212> DNA
<213> Artificial Sequence

<400> 68
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgggtgcga aaagggtttt 60
ccc 63

<210> 69
<211> 23
<212> DNA
<213> Artificial Sequence

<400> 69
ccagtggttt aggctgtgtg gtc 23

<210> 70
<211> 21
<212> DNA
<213> Artificial Sequence

<400> 70
ctgagttgga tgtattggat c 21

<210> 71
<211> 28
<212> DNA
<213> Artificial Sequence

<400> 71
ggatccatgg ttacttcgac aaaaatcc 28

<210> 72
<211> 60
<212> DNA
<213> Artificial Sequence

<400> 72
gcaatgtaac atcagagatt ttgagacaca acgtggcttt gctaggcaac cgcttagtac 60

<210> 73
<211> 28
<212> DNA
<213> Artificial Sequence

<400> 73
gaattcttaa cccaacagta aagttccc 28

<210> 74
<211> 63
<212> DNA
<213> Artificial Sequence

<400> 74
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccggcat tgtcttttac 60
atg 63

<210> 75
<211> 20
<212> DNA
<213> Artificial Sequence

<400> 75
ggaacccttg cagccgcttc 20

<210> 76
<211> 22
<212> DNA
<213> Artificial Sequence

<400> 76
gtatgcccaa ctggtgcaga gg 22

<210> 77
<211> 28
<212> DNA
<213> Artificial Sequence

<400> 77
ggatccatgt ctgacacaca aaataccg 28

<210> 78
<211> 62
<212> DNA
<213> Artificial Sequence

<400> 78
gcaatgtaac atcagagatt ttgagacaca acgtggcttt cgccaatacc agccaccaac 60
ag 62

<210> 79
<211> 27
<212> DNA
<213> Artificial Sequence

<400> 79
gaattctcaa atccccgcat ggcctag 27

<210> 80
<211> 65
<212> DNA
<213> Artificial Sequence

<400> 80
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggcccacg gcttggacgt 60
gtggg 65

<210> 81
<211> 21
<212> DNA
<213> Artificial Sequence

<400> 81
cacttggatt cccctgatct g 21

<210> 82
<211> 21
<212> DNA
<213> Artificial Sequence

<400> 82
gcaatacccg cttggaaaac g 21

<210> 83
<211> 29
<212> DNA
<213> Artificial Sequence

<400> 83
ggatccatga ccgaatcttc gcccctagc 29

<210> 84
<211> 61
<212> DNA
<213> Artificial Sequence

<400> 84
gcaatgtaac atcagagatt ctgagacaca acgtggcttt caatcctagg tagccgaggc 60
^ .61 

<210> 85
<211> 27
<212> DNA
<213> Artificial Sequence

<400> 85
gaattcttag cccaggccag cccagcc 27

<210> 86
<211> 66
<212> DNA
<213> Artificial Sequence

<400> 86
ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggggaatt gatttgttta 60
attacc 66

<210> 87
<211> 21
<212> DNA
<213> Artificial Sequence

<400> 87
gcgatcgcca ttatcgcttg g 21

<210> 88
<211> 24
<212> DNA
<213> Artificial Sequence

<400> 88
gcagactggc aattatcagt aacg 24

<210> 89
<211> 25
<212> DNA
<213> Artificial Sequence

<400> 89
ccatggattc gagtaaagtt gtcgc 25

<210> 90
<211> 25
<212> DNA
<213> Artificial Sequence

<400> 90
gaattcactt caaaaaaggt aacag 25

<210> 91
<211> 4550
<212> DNA
<213> Arabidopsis sp

<400> 91
attttacacc aatttgatca cttaactaaa ttaattaaat tagatgatta tcccaccata 60
tttttgagca ttaaaccata aaaccatagt tataagtaac tgttttaatc gaatatgact 120
cgattaagat taggaaaaat ttataaccgg taattaagaa aacattaacc gtagtaaccg 180
taaatgccga ttcctccctt gtctaaaaga cagaaaacat atattttatt ttgccccata 240
tgtttcactc tatttaattt caggcacaat acttttggtt ggtaacaaaa ctaaaaagga 300
caacacgtga tacttttcct cgtccgtcag tcagattttt tttaaactag aaacaagtgg 360
caaatctaca ccacattttt tgcttaatct attaacttgt aagttttaaa ttcctaaaaa 420
agtctaacta attcttctaa tataagtaca ttccctaaat ttcccaaaaa gtcaaattaa 480
taattttcaa aatctaatct aaatatctaa taattcaaaa tcattaaaaa gacacgcaac 540
aatgacacca attaatcatc ctcgacccac acaattctac agttctcatg ctaaaccata 600
ttttttgctc tctgttcctt caaaatcatt tctttctctt ctttgattcc caaagatcac 660
ttctttgtct ttgatttttg attttttttc tctctggcgt gaaggaagaa gctttatttc 720
atggagtctc tgctctctag ttcttctctt gtttccgctg gtaaatctcg tccttttctg 780
gtttcaggtt ttatttgttg tttaggtttc gtttttgtga ttcagaacca tacaaaaagt 840
ttgaactttt ctgaatataa aataaggaaa aagtttcgat ttttataatg aattgtttac 900
tagatcgaag taggtgacaa aggttattgt gtggagaagc ataatttctg ggcttgactt 960
tgaattttgt ttctcatgca tgcaacttat caatcagctg gtgggttttg ttggaagaag 1020
cagaatctaa agctccactc tttatcaggt tcgttagggt tttatgggtt tttgaaatta 1080
aatactcaat catcttagtc tcattattct attggttgaa tcacattttc taatttggaa 1140
tttatgagac aatgtatgtt ggacttagtt gaagttcttc tctttggtta tagttgaagt 1200
gttactgatg ttgtttagct ctttacacca atatatacac ccaattttgc agaaatccga 1260
gttctgcgtt gtgattcgag taaagttgtc gcaaaaccga agtttaggaa caatcttgtt 1320
aggcctgatg gtcaaggatc ttcattgttg ttgtatccaa aacataagtc gagatttcgg 1380
gttaatgcca ctgcgggtca gcctgaggct ttcgactcga atagcaaaca gaagtctttt 1440
agagactcgt tagatgcgtt ttacaggttt tctaggcctc atacagttat tggcacagtt 1500
aagtttctct ttaaaaatgt aactctttta aaacgcaatc tttcagggtt ttcaaggaga 1560
taacattagc tctgtgattg gatttgcagg tgcttagcat tttatctgta tctttcttag 1620
cagtagagaa ggtttctgat atatctcctt tacttttcac tggcatcttg gaggtaatga 1680
atatataaca cataatgacc gatgaagaag atacattttt ttcgtctctc tgtttaaaca 1740
attgggtttt gttttcaggc tgttgttgca gctctcatga tgaacattta catagttggg 1800
ctaaaccagt tgtctgatgt tgaaatagat aaggtaacat gcaaattttc ttcatatgag 1860
ttcgagagac tgatgagatt aatagcagct agtgcGtaga tcatctctat gtgggttttt 1920
gcaggttaac aagccctatc ttccattggc atcaggagaa tattctgtta acaccggcat 1980
tgcaatagta gcttccttct ccatcatggt atggtgccat tttcacaaaa tttcaacttt 2040
tagaattcta taagttactg aaatagtttg ttataaatcg ttatagagtt tctggcttgg 2100
gtggattgtt ggttcatggc cattgttctg ggctcttttt gtgagtttca tgctcggtac 2160
tgcatactct atcaatgtaa gtaagtttct caatactaga atttggctca aatcaaaatc 2220
tgcagtttct agttttaggt taatgaggtt ttaataactt acttctacta caaacagttg 2280
ccacttttac ggtggaaaag atttgcattg gttgcagcaa tgtgtatcct cgctgtccga 2340
gctattattg ttcaaatcgc cttttatcta catattcagg tactaaacca ttttccttat 2400
gttttgtagt tgttttcatc aaaatcactt ttatattact aaagctgtga aactttgttg 2460
cagacacatg tgtttggaag accaatcttg ttcactaggc ctcttatttt cgccactgcg 2520
tttatgagct ttttctctgt cgttattgca ttgtttaagg taaacaaaga tggaaaaaga 2580
ttaaatctat gtatacttaa agtaaagcat tctactgtta ttgatgagaa gttttctttt 2640
ttggttggat gcaggatata cctgatatcg aaggggataa gatattcgga atccgatcat 2700
tctctgtaac tctgggticag aaacgggtac gatatctaaa ctaaagaaat tgttttgact 2760
caagtgtcgg attaagatta cagaagaaag aaaactgttt ttgtttcttg caaaattcag 2820
gtgttttgga catgtgttac actacttcaa atggcttacg ctgttgcaat tctagttgga 2880
gccacatctc cattcatatg gagcaaagtc atctcggtaa caatctttct ttacccatcg 2940
aaaactcgct aattcatcgt ttgagtggta ctggtttcat tttgttccgt tctgttgatt 3000
ttttttcagg ttgtgggtca tgttatactc gcaacaactt tgtgggctcg agctaagtcc 3060
gttgatctga gtagcaaaac cgaaataact tcatgttata tgttcatatg gaaggttaga 3120
ttcgtttata aatagagtct ttactgcctt tttatgcgct ccaatttgga attaaaatag 3180
cctttcagtt tcatcgaatc accattatac tgataaattc tcatttctgc atcagctctt 3240
ttatgcagag tacttgctgt tacctttttt gaagtgactg acattagaag agaagaagat 3300
ggagataaaa gaataagtca tcactatgct tctgttttta ttacaagttc atgaaattag 3360
gtagtgaact agtgaattag agttttattc tgaaacatgg cagactgcaa aaatatgtca 3420
aagatatgaa tttctgttgg gtaaagaagt ctctgcttgg gcaaaatctt aaggttcggt 3480
gtgttgatat aatgctaagc gaagaaatcg attctatgta gaaatttccg aaactatgtg 3540
taaacatgtc agaacatctc cattctatat cttcttctgc aagaaagctc tgtttttatc 3600
acctaaactc tttatctctg tgtagttaag atatgtatat gtacgtgact acattttttt 3660
gttgatgtaa tttgcagaac gtatggattt ttgttagaaa gcatgagttc gaaagtatat 3720
gtttatatat atggataatt cagacctaac gtcgaagctc acaagcataa attcactact 3780
atagtttgct ctgtaataga tagttccatt gatgtcttga aactgtacgt aactgcctgg 3840
gcgttttgtg gttgatactg actactgagt gttctttgtg agtgttgtaa gtatacaaga 3900
agaagaatat aggctcacgg gaacgactgt ggtggaagat gaaatggaga tcatcacgta 3960
gcggctttgc caaagaccga gtcacgatcg agtctatgaa gtctttacag ctgctgatta 4020
tgattgacca ttgcttagag acgcattgga atcttactag ggacttgcct gggagtttct 4080
tcaagtacgt gtcagatcat acgatgtagg agatttcacg gctttgatgt gtttgtttgg 4140
agtcacaatg cttaatgggc ttattggccc aataatagct agctcttttg ctttagccgt 4200
ttcgtttgtc ccctggtggt gagtattatt agggtatggt gtgaccaaag tcaccagacc 4260
tagagtgaat ctagtagagt cctagaccat ggtccatggc ttttatttgt aatttgaaaa 4320
atgaacaatt ctttttgtaa ggaaaacttt tatatagtag acgtttacta tatagaaact 4380
agttgaacta acttcgtgca atcgcataat aatggtgtga aatagagggt gcaaaactca 4440
ataaacattt cgacgtacca agagttcgaa acaataagca aaatagattt ttttgcttca 4500
gactaatttg tacaatgaat ggttaataaa ccattgaagc ttttattaat 4550



<210> 92
<211> 4450
<212> DNA
<213> Arabidopsis sp

<400> 92
tttaggttac aaaatcaatg atattgcgta tgtcaactat aaaagccaaa agtaaagcct 60
cttgtttgac cagaaggtca tgatcattgt atacatacag ccaaactacc tcctggaaga 120
aaagacatgg atcccaaaca acaacaatag cttcttttac aagaaccagt agtaactagt 180
cactaatcta aaagagttaa gtttcagctt ttctggcaat ggctccttga tcatttcaat 240
cctgaaggag acccactttg tagcaagacc atgtcctctg tttcacttac agtgtgtctc 300
aaaagtctac ttcaattcnt catatatagg ttcctcacac tacagcttca tcctcattcg 360
ttgacagaga gagagtcttt attgaaaact tcttccaagt acaactccac taaatataat 420
agcaccaaac cacttgttcg acacaaatct gtacagatat aaaaacacta ttaggttttc 480
caaggcaaat cacataattg gattgtgaaa gagtacaaaa gataaaccca aattttcata 540
ctttctactg cagtcagcac cagatgataa gtcagctgtc cctatttgcc atcctaactg 600
tcctgatgca gcggccagtg atgcgtaata ttgccaccct taatcattag agcgagaaac 660
aaaaagaatc aaaagacagt aaatggaatt aggaatcaca aatgagtcct tgtaaagttt 720
attgagcacc gagatctgca ctgaatccag aaagtgcaag aaaacctatg gatgctgtgc 780
caaatccagt taaccaaagc tttgtattat caccgaatct aagggctgtt gacttaacac 840
caacttttac atcatcttct ttgtcctgga gacacaatat attagacatt agtccatgga 900
aaaaaaatga tttaacctag aatatctcaa aattacttgc ataaaaactg aacttgagct 960
gaaattttgg gttcgtagct tgtggcatat actatttcat tttcaatggg ccacaaaggt 1020
aactttcttt tctcacttct gttgcaaacg ggaagacttt tatggggcta actcttcact 1080
taaagtatag aaatcagatg gaaaaggtgg gagatcaggg taattttctt ctttatgatt 1140
gacaaaagtc gaacatcgaa atggatgcat ttgcatgaga catgaaacaa aagctgaaaa 1200
agaaatctgt ggtggtgaag ctagaaaaag aaaacaaagc aagcaatatg cacacattga 1260
gattaactac tttgctactg gtcataatca aatagatttt gaagctaaaa aataaaaagt 1320
gaatatacct gatgtgcata aatagtatca taaacaaggg tccagcagac tccggagaga 1380
tagagaggga gtacaataga tggtgctatg cttcctttaa ctgcagtcca tcctaacaat 1440
gctccccagt ttatggtcaa acctaaaaag gcttgaggct gcaattataa aaacgaatca 1500
atcataagaa aatcagaaaa tatataatgt ctaactttga gaagccagaa tagatttaaa 1560
ttacccaaaa tgtaaacctc ttcataagtg ggtaggaaaa gacaagtaac aaagatgaag 1620
cccctaaaac acggctgcag aatatacata ctgaaatgag ctcaagtaga aaagaatttg 1680
atcacaaaac taaagacaag acctgagaac atatcttcag aatttgggcc aactacataa 1740
gggtgaacca tatgtgtatg tgaattttta aacaaacact tgcaaatacg cgactttagg 1800
gcaagtaaaa aatccaaaca aacctgtaat tgttaagttg gagaagaatc cctaagccta 1860
aaagcaactg cagcccgaga aatccaatcc cttgaaatgg tgtcaaaaga ccactggcga 1920
taggtcttag ttttgtacga tcaacctgga tataaaagaa atttgtaaga caacataatc 1980
taaaacaaaa caaccataca aaatcttgag ctttacatac aagcaaccca tctttgttta 2040
tggaagaatg aatccagtta catgaatgct gtgtatctac cctaactact aaacacatat 2100
ttcaatcgaa aaacatattc caccttcacc atatctaaca cctgaagtct [illegible] 2160
gaacgaagtc atcagaacat gcagataagc tattacccaa aacagagata tgactggaaa 2220
tgttgtcgta aattgatcca acatagaaaa atcaagacca gttccagatg tcaaagcaat 2280
aacactttcc caccatggtt acagaaacca tagttacaca [illegible] [illegible] 2340
catactaaag ggatatataa atttgacatc actttatcac [illegible] [illegible] 2400
aaacaaactg acctttgtat ctatgtcctg atcaagcaga [illegible] [illegible] 2460
acctctaaga agtaatgctc cgcaaccaaa taaagccata tatttaaaac [illegible] 2520
tccaggatca gcagccaacg caatcgacct atacaacaat gatggagatt cagagtatcg 2580
atctatttac atagctctgg aactagatcc atgacgaaac atggaacatc gttataatat 2640
ctaaagactt ccaaacagat tcctgagtaa gaaacccagt ggaactatag tactgtaaca 2700
tatataaaat caaagaaaac tcaggtttat agcattatcc aatcctgatt tctgccaatc 2760
cttaaccact ctcccatgct atcaaaaacc tcagctcaag atcatactac ctaattgcct 2820
atgagctctt gggaagatca ttatggattt gataactgaa aaaagtaaca gagaaatagc 2880
agactgcaag aactactcca aacttctcca ctgatatgta tgtagtctaa caataataaa 2940
cagacataaa ttcttttatc aagcttcaag agcaagttag tcagaaaaca tcacagccaa 3000
accaaccagg aaaacacata actttatcac ataaaactaa atttaatgta atctgactta 3060
acataaacca tcctttggga cgaaaggaaa ctatataaac atgcagtctt tctttccctc 3120
agctattctc tcggatggat tataatgaat ctcaaaagtg aaatgtcttg attctcagct 3180
acattactca aaggcgaaga taaacttacc acatacaagg ccacgcaagc aaccaagttc 3240
caatgggttt atccaatcga gcaagcttag cataacctct aacttcttct ggtaaataca 3300
aatctatcca agaagcttcc ttaacaacaa caccatcact cttctcctta tcatctttct 3360
tcggctttcc ctccaaaacc gaagaagacg acgacattcc acaaattaat ctgtaattcc 3420
aaccaacacc aaaaaacttc tcctgatgca attctcttcc tttactccat acttggtaat 3480
tatcattcca tgaaggataa cacttagtga aaggatttgt gtaatgggta gtcacaggat 3540
tggacaagga tttatgttgt gattgcaaaa gagcagagga agaagatgga gttacggaga 3600
cggaagattt caacaaccgt cttgaaacac gggagagccc aaaaaacgcc atctttgaga 3660
gaaattgttg cctggaagaa acaaagactt gagatttcaa acgtaagtga attcttacga 3720
acgaaagcta acttctcaag agaatcagat tagtgattcc tcaaaaacaa acaaaactat 3780
ctaatttcag tttcgagtga tgaagcctta agaatctaga acctccatgg cgtttctaat 3840
ctctcagaga taatcgaatt ccttaaacaa tcaaagctta gaaagagaag aacaacaaca 3900
acaacaaaaa aaatcagatt aacaaccgac cagagagcaa cgacgacgcc ggcgagaaag 3960
agcacgtcgt ctcggagcaa gacttcttct ccagtaaccc ggatggatcg ttaatgggcc 4020
tgtagattat tatatttggg ccgaaacaat tgggtcagca aaaacttggg ggataatgaa 4080
gaaacacgta cagtatgcat ttaggctcca aattaattgg ccatataatt cgaatcagat 4140
aaactaatca acccctacct tacttatttc tcactgtttt tatttctacc ttagtagttg 4200
aagaaacact tttatttatc ttttcgggac ccaaatttga taggatcggg ccattactca 4260
tgagcgtcag acacatatta gccttatcag attagtgggg taaggttttt ttaattcggt 4320
aagaagcaac aatcaatgtc ggagaaatta aagaatctgc atgggcgtgg cgtgatgata 4380
tgtgcatatg gagtcagttg ccgatcatat ataactattt ataaactaca tataaagact 4440
actaatagat 4450



<210> 93
<211> 2850
<212> DNA
<213> Arabidopsis sp

<400> 93
aattaaaatt tgagcggtct aaaccattag accgtttaga gatccctcca acccaaaata 60
gtcgattttc acgtcttgaa catatattgg gccttaatct gtgtggttag taaagacttt 120
tattggtcaa agaaaaacaa ccatggccca acatgttgat acttttattt aattatacaa 180
gtacccctga attctctgaa atatatttga ttgacccaga tattaatttt aattatcatt 240
tcctgtaaaa gtgaaggagt caccgtgact cgtcgtaatc tgaaaccaat ctgttcatat 300
gatgaagaag tttctctcgt tctcctccaa cgcgtagaaa attctgacgg cttaacgatg 360
tggcgaagat ctgttgttta tcgtttctct tcaagaatct ctgtttcttc ttcgttacca 420
aaccctagac tgattccttg gtcccgcgaa ttatgtgccg ttaatagctt ctcccagcct 480
ccggtctcga cggaatcaac tgctaagtta gggatcactg gtgttagatc tgatgccaat 540
cgagtttttg ccactgctac tgccgccgct acagctacag ctaccaccgg tgagatttcg 600
tctagagttg cggctttggc tggattaggg catcactacg ctcgttgtta ttgggagctt 660
tctaaagcta aacttaggta tgtgtttact tttcttttct catgaaaaat ctgaaaattt 720
ccaattgttg gattcttaaa ttctcatttg ttttatggtt gtagtatgct tgtggttgca 780
acttctggaa ctgggtatat tctgggtacg ggaaatgctg caattagctt cccggggctt 840
tgttacacat gtgcaggaac catgatgatt gctgcatctg ctaattcctt gaatcaggtc 900
attgaaatgt tgagaagttc ataaatttcg aatccttgtt gtgtttatgt agttgatctt 960
gcttgcttat gtttatgtag ttgaaaagtit taaaaatttc taatccttgg tagttgatct 1020
cgcttgtttg ttttttcatt ttctagattt tcgagataag caatgattct aagatgaaaa 1080
gaacgatgct aaggccattg ccttcaggac gtaccagtgc tccacacgct gttgcatggg 1140
ctactattgc tggtgcttct ggtgcttgtt tgttggccag caaggtgaat gtttgttttt 1200
ttatatgtga tttctttgtt ttatgaatgg gtgattgaga gattatggat ctaaactttt 1260
gcttccacga caaggttatt gcagactaat atgttggctg ctggacttgc atctgccaat 1320
cttgtacttt atgcgtttgt ttatactccg ttgaagcaac ttcaccctat caatacatgg 1380
gttggcgctg ttgttggtgc tatcccaccc ttgcttgggt aaatttttgt tccttttctt 1440
ctttatttta gcagattctg ttttgttgga tactgctttt aattcaaaat gtagtcatgg 1500
ttcaccaatt ctatgcttat ctattttgtg cgtcgtcagg tgggcggcag cgtctggtca 1560
gatttcatac aattcgatga ttcttccagc tgctcctcac ttttggcaga tacctcattt 1620
tatggccctt gcacatctct gccgcaatga ttatgcagct ggagggtaag accatatggt 1680
gtcatatgag attagaatgt ctccttccat gtagtgttga tcttgaacta gttcaatttc 1740
gtggaatgat cagagtgtcc tagatagtgt cacagcagtc gacattttag tggctagata 1800
atgagttctt tccgttagag ataaacattc gcgaacattg tttccagctt ccgcgaccca 1860
acttctgatt ttgtttcttg gtaccttgtt ttcagttaca agatgttgtc actctttgat 1920
ccgtcaggga agagaatagc agcagtggct ctaaggaact gcttttacat gatccctctc 1980
ggtttcatcg cctatgactg tgagtcttgt agattcatct ttttttcgca gtttattgac 2040
tgcattgctg tatctgattt ttgctgttcc ttccaatctt tgtgacaggg gggttaacct 2100
caagttggtt ttgcctcgaa tcaacacttc tcacactagc aatcgctgca acagcatttt 2160
cattctaccg agaccggacc atgcataaag caaggaaaat gttccatgcc agtcttctct 2220
tccttcctgt tttcatgtct ggtcttcttc tacaccgtgt ctctaatgat aatcagcaac 2280
aactcgtaga agaagccgga ttaacaaatt ctgtatctgg tgaagtcaaa actcagaggc 2340
gaaagaaacg tgtggctcaa cctccggtgg cttatgcctc tgctgcaccg tttcctttcc 2400
tcccagctcc ttccttctac tctccatgat aacctttaag caagctattg aatttttgga 2460
aacagaaatt aaaaaaaaaa tctgaaaagt tcttaagttt aatctttggt taataatgaa 2520
gtggagaacg catacaagtt tatgtatttt ttctcatctc cacataattg tattttttct 2580
ctaagtatgt ttcaaatgat acaaaataca tactttatca attatctgat caaattgatg 2640
aatttttgag ctttgacgtg ttaggtctat ctaataaacg tagtaacgaa tttggttttg 2700
gaaatgaaat ccgataaccg atgatggtgt agagttaaac gattaaaccg ggttggttaa 2760
aggtctcgag tctcgacggc tgcggaaatc ggaaaatcac gattgaggac tttgagctgc 2820
cacgaagatg gcgatgaggt tgaaatcaat 2850



<210> 94 
30 <211> 3660 
<212> DNA 

<213> Arabidopsis sp 



<400> 94 

tatttgtatt tttattgtta aattttatga tttcacccgg tatatatcat cccatattaa 60
tattagattt attttttggg ctttatttgg gttttcgatt taaactgggc ccattctgct 120
tcaatgaaac cctaatgggt tttgtttggg ctttggattt aaaccgggcc cattctgctt 180
caatgaaggt cctttgtcca acaaaactaa catccgacac aactagtatt gccaagagga 240
tcgtgccaca tggcagttat tgaatcaaag gccgccaaaa ctgtaacgta gacattactt 300
atctccggta acggacaacc actcgtttcc cgaaacagca actcacagac tcacaccact 360
ccagtctccg gcttaactac caccagagac gattctctct tccgtcggtt ctatgacttc 420
gattctcaac actgtctcca ccatccactc ttccagagtt acctccgtcg atcgagtcgg 480
agtcctctct cttcggaatt cggattccgt tgagttcact cgccggcgtt ctggtttctc 540
gacgttgatc tacgaatcac ccggtagtta gcattctgtt ggatagattg atgaatgttt 600
tcttcgattt tttttttact gatcttgttg tggatctctc gtagggcgga gatttgttgt 660
gcgtgcggcg gagactgata ctgataaagg tatgattttt tagttgtttt tattttctct 720
ctcttcaaaa ttctcttttc aaacactgtg gcgtttgaat ttccgacggc agtcaaatct 780
cagacacctg acaaggcacc agccggtggt tcaagcatta accagcttct cggtatcaaa 840
ggagcatctc aagaaactgt aattttgttc atctcctcag aatcttttaa attatcatat 900
ttgtggataa tgatgtgtta gtttaggaat tttcccacta aaggtaatct cttttgagga 960
caagtcttgt tttcagctta gaaatgatgt gaaaatgttg t c c g c u age L. aaaaagag 1 1 1020
tgttgttata ttctgtattc agaataaatg gaagattcgt cttcagctta caaaaccagt 1080
cacttggcct ccactggttt ggggagtcgt ccgtggcgct gctgcttcag gtaatcatac 1140
gaacctcttt tggatcatgc aatactgtac agaaagt 1 1 1 cccaucuccc ttccaattgt 1200
ttcttctggc agggaacttt cattggaccc cagaggatgt tgctaagtcg attctttgca 1260
tgatgatgtc tggtccttgt cttactggct atacacaggt ctggttttac acaacaaaaa 1320
gctgacttgt tcttattcta gtgcatttgc ttggtgctac aataacctag acttgtcgat 1380
ttccagacaa tcaacgactg gtatgataga gatatcgacg caattaatga gccatatcgt 1440
ccaattccat ctggagcaat atcagagcca gaggtaactg agacagaaca ttgtgagctt 1500
ttatctcttt tgtgattctg atttctcctt actccttaaa atgcaggtta ttacacaagt 1560
ctgggtgcta ttattgggag gtcttggtat tgctggaata ttagatgtgt gggtaagttg 1620
gcccttctga cattaactag tacagttaaa gggcacatca gatttgctaa aatcttccct 1680
tatcaggcag ggcataccac tcccactgtc ttctatcttg ctttgggagg atcattgcta 1740
tcttatatat actctgctcc acctcttaag gtaagtttta ttcctaactt ccactctcta 1800
gtgataagac actccatcca agttttggag ttttgaatat cgatatctga actgatctca 1860
ttgcagctaa aacaaaatgg atgggttgga aattttgcac ttggagcaag ctatattagt 1920
ttgccatggt aagatatctc gtgtatcaat aatatatggc gttgttccca tctcattgat 1980
ttgtttcttg ctcacttgac tgataggtgg gctggccaag cattgtttgg cactcttacg 2040
ccagatgttg ttgttctaac actcttgtac agcatagctg gggtactctt ttggcaaacc 2100
ttttatgttg cttttttcgt tatctgttgt aatatgctct tgcttcatgt tgtacctttg 2160
tgataatgca gttaggaata gccattgtta acgacttcaa aagtgttgaa ggagatagag 2220
cattaggact tcagtctctc ccagtagctt ttggcaccga aactgcaaaa tggatatgcg 2280
ttggtgctat agacattact cagctttctg ttgccggtat gtactatcca ctgtttttgt 2340
gcagctgtgg cttctatttc ttttccttga tcttatcaac tggatattca ccaatggtaa 2400
agcacaaatt aatgaagctg aatcaacaaa ggcaaaacat aaaagtacat tctaatgaaa 2460
tgagctaatg aagaggaggc atctactttt atgtttcatt agtgtgattg s^tggat 1 1 1 c 2520
atttcatgct tctaaaacaa gtattttcaa cagtgtcatg aaataacaga acttatatct 2580
tcatttgtac ttttactagt ggatgagtta cacaatcatt gttatagaac caaatcaaag 2640
gtagagatca tcattagtat atgtctattt cggc ugcagg acauct.at.ua gcatctggga 2700
aaccttatta tgcgttggcg ttggttgctt tgatcattcc tcagattgtg ttccaggtaa 2760
agacgttaac agtctcacat tataattaat caaattcttg tcactcgtct gattgctaca 2820
ctcgcttcta taaactgcag tttaaatact u cc ucaagga ccc ugucaaa tacgacgtca 2880
agtaccaggt aagtcaactt agtacacatg ut-cgcgcccu u L. cgaaaua t. ctttgagagg 2940
tctcttaatc agaagttgct tgaaacactc clUL.I_t-gctI_L.a. d a.g g c a. c« y vjg (-.agcjcjo. u c 3000
cttggtgctc ggaatatttg taacggcatt ag c a u c g \j a a. ccic ugaaadd ggcgca I- c u c 3060
gatggggttt tgtcgaaagc agaggtgttg acacatcaaa tgtgggcaag tgatggcatc 3120
aactagttta aaagattttg taaaatgtat gcaccgcuau tactagaaac aactcctgtt 3180
gtatcaattt agcaaaacgg ctgagaaatt gtaattgatg ttaccgtatt tgcgctccat 3240
ttttgcattt cctgctcata tcgaggattg gggtttatgt tagttctgtc acttctctgc 3300
tttcagaatg tttttgtttt ctgtagtgga ttttaactat tttcatcact ttttgtattg 3360
attctaaaca tgtatccaca taaaaacagt aatatacaaa aatgatactt cctcaaactt 3420
tttataatct aaatctaaca actagctagt aacccaacta acttcataca attaatttga 3480
gaaactacaa agactagact atacatatgt tatttaacaa cttgaaaccg tgttattact 3540
acctgatttt tttctattct acagccattt gatatgctgc aatcttaaca tatcaagtct 3600
cacgttgttg gacacaacat actatcacaa gtaagacacg aagtaaaacc aaccggcaac 3660